Anthropic’s Claude can now control computers like people do

Anthropic’s already impressive Claude 3.5 Sonnet gains a significant performance boost on Tuesday as the generative AI startup rolls out an enhanced and updated version of the model alongside the new, lightweight Claude 3.5 Haiku. The Sonnet update includes a public beta feature that gives the AI basic control over the computer it’s running on.

Claude 3.5 Sonnet was already a performance leader when it comes to coding tasks, but the new version shows significant across-the-board improvements over its predecessor and consistently outperforms both Gemini 1.5 and GPT-4o on a variety of industry benchmarks. Gemini 1.5 Pro was the only model to best the new 3.5 Sonnet on any test, and did so on the MATH benchmark.

The new 3.5 Haiku is no slouch, either, despite its small size. Scheduled to be released later this month, 3.5 Haiku outperforms Claude 3 Opus, the company’s largest last-generation model. Like its larger sibling, the new Haiku is exceedingly proficient at coding tasks, scoring 40.6% on SWE-bench Verified, higher than both GPT-4o and the original 3.5 Sonnet.

Even more impressive, the new Claude 3.5 Sonnet can now interact with desktop apps via the “Computer Use” API. The AI can generate the keystrokes, mouse clicks, and cursor movements needed to emulate a human user. The company is quick to point out that the system is currently quite experimental and prone to errors. The underlying purpose of the public beta release is to elicit feedback from developers to rapidly improve the API’s performance.

“We trained Claude to see what’s happening on a screen and then use the software tools available to carry out tasks,” Anthropic wrote in a blog post. “When a developer tasks Claude with using a piece of computer software and gives it the necessary access, Claude looks at screenshots of what’s visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place.”
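
To make the pixel-counting idea concrete, here is a minimal Python sketch of the arithmetic involved. It is an illustration only, not Anthropic's implementation: the `Point` type and the `cursor_offset` and `scale_to_screen` helpers are hypothetical names, and the sketch assumes the screenshot may have been resized before the model saw it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A pixel position, measured from the top-left corner."""
    x: int
    y: int


def cursor_offset(current: Point, target: Point) -> tuple[int, int]:
    """Return (dx, dy): how many pixels to move right and down.

    Negative values mean moving left or up.
    """
    return target.x - current.x, target.y - current.y


def scale_to_screen(p: Point, shot_size: tuple[int, int],
                    screen_size: tuple[int, int]) -> Point:
    """Map a point found in a (possibly resized) screenshot to real screen pixels."""
    shot_w, shot_h = shot_size
    screen_w, screen_h = screen_size
    return Point(round(p.x * screen_w / shot_w), round(p.y * screen_h / shot_h))


# Hypothetical example: a button spotted at (512, 384) in a 1024x768 screenshot
# of a 2048x1536 display, with the cursor currently at (100, 200).
button = scale_to_screen(Point(512, 384), (1024, 768), (2048, 1536))
dx, dy = cursor_offset(Point(100, 200), button)
```

In this example the button maps to screen position (1024, 768), so the cursor would need to travel 924 pixels right and 568 pixels down before clicking.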

It’s an AI agent, essentially. That is, it’s an AI that can automate other software processes, whether that’s generating and qualifying marketing leads, uncovering patterns and trends in medical data, or simply navigating to a specific website and filling out a form you need. Think of it as a more advanced version of existing Robotic Process Automation systems.

The company cites Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company as early adopters of the new feature. Replit, for example, is using Computer Use to “develop a key feature that evaluates apps as they’re being built for their Replit Agent product,” per the announcement.

There’s no need to worry about the AI going all Skynet on us (yet), as Anthropic explains. “Humans remain in control by providing specific prompts that direct Claude’s actions, like ‘use data from my computer and online to fill out this form,’” an Anthropic spokesperson told TechCrunch. “People enable access and limit access as needed. Claude breaks down the user’s prompts into computer commands (e.g., moving the cursor, clicking, typing) to accomplish that specific task.”
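
The division of labor described in that quote, in which a prompt becomes a list of low-level commands that run only with the access a person has granted, can be sketched in a few lines of Python. Everything here is hypothetical: `FakeComputer`, `run_actions`, and the action dictionary format are stand-ins invented for illustration and do not reflect the actual Computer Use API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeComputer:
    """An in-memory stand-in for a real desktop; it only records what happens."""
    cursor: tuple[int, int] = (0, 0)
    typed: str = ""
    clicks: list[tuple[int, int]] = field(default_factory=list)

    def move(self, x: int, y: int) -> None:
        self.cursor = (x, y)

    def click(self) -> None:
        self.clicks.append(self.cursor)

    def type_text(self, text: str) -> None:
        self.typed += text


def run_actions(computer: FakeComputer, actions: list[dict],
                allowed: set[str]) -> list[str]:
    """Execute actions in order, refusing any type the user has not enabled."""
    executed = []
    for action in actions:
        kind = action["type"]
        if kind not in allowed:
            raise PermissionError(f"action {kind!r} is not enabled")
        if kind == "move":
            computer.move(action["x"], action["y"])
        elif kind == "click":
            computer.click()
        elif kind == "type":
            computer.type_text(action["text"])
        else:
            raise ValueError(f"unknown action {kind!r}")
        executed.append(kind)
    return executed


# Hypothetical plan for "fill out this form": move to a field, click, type a name.
plan = [
    {"type": "move", "x": 120, "y": 340},
    {"type": "click"},
    {"type": "type", "text": "Ada"},
]
pc = FakeComputer()
done = run_actions(pc, plan, allowed={"move", "click", "type"})
```

Shrinking the `allowed` set, for instance to `{"move"}`, makes the same plan fail at the click step, which is one simple way a person could limit access as needed.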

Anthropic also concedes that Computer Use could be misused to generate spam, spread misinformation, or commit fraud. In response, the company has developed new classifiers that identify when the API is being used and whether that use is “causing harm.”

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…