Skip to main content

5 ways that future A.I. assistants will take voice tech to the next level

Apple

Since Siri debuted on the iPhone 4s back in 2011, voice assistants have gone from unworkable gimmick to the basis for smart speaker technology found in one in six American homes.

“Before Siri, when I talked about [what I do] there were blank stares,” Tom Hebner, head of innovation at Nuance Communications, which develops cutting edge A.I. voice technology, told Digital Trends. “People would say, ‘Do you build those horrible phone systems? I hate you.’ That was one group of people’s only interaction with voice technology.”

Recommended Videos

That’s no longer the case today. According to eMarketer forecasts, almost 100 million smartphone users will be using voice assistants by 2020. But while A.I. assistants are no longer a novelty, we’re still at the start of their evolution. There’s a long way to go before they fully live up to the promise that voice assistants have as a product category.

Here are five ways in which the technology could improve to make it smarter and more efficient — and help us lead more productive lives as a result. Call them “predictions” or a “wishlist,” these are the challenges that need to be solved.

Mo’ knowledge, less problems

Alexa can tell you what the weather is in Kuala Lumpur, Malaysia; the total number of U.S. dollars you’ll get for 720 South African Rand, and how to spell “disestablishmentarianism.” But consumer A.I. assistants are, in essence, the digital equivalent of a person with a complete set of up-to-date encyclopedias. You get (hopefully) the right information, but there’s no pro-grade level of expertise there.

“The challenge that the systems in your home have is that there’s such a broad range of things that they’re trying to do,” Hebner told Digital Trends.

Image used with permission by copyright holder

This is a tough one to solve, but doing so would be a game-changer. Nuance develops many specialist systems aimed at one specific use-case, such as helping airline customers answer queries or doctors to take notes. Doing so not only means these systems can drill down to get more detailed information, but also means that more intelligence can get baked in. “People were very excited about computers that could understand words, but that doesn’t necessarily matter if you don’t know what to do with those words,” Hebner said.

One example he gives is of a Nuance system that not only understands when doctors read out a list of potential drugs for patients, but could call out potential conflicts. This is way beyond the capabilities of most user-grade A.I. assistants.

However, having a more specialist detailed knowledge of different domains — something hinted at by Alexa Skills — could be transformative. Asking your smart speaker for legal or medical advice sounds, on the face of it, crazy. But there have been extraordinary advances in fields like legal bots, while a recently published report suggests Apple wants Siri to be able to have health-focused conversations with users by 2021.

Specialist knowledge graphs for A.I. assistants are the stuff of sci-fi dreams right now, although a recent Voicebot.ai report shows just how rapidly virtual assistants’ skillsets are expanding. When skills move into the terrain of specialities, though, we’re going to be in for a treat!

More (and better) personalization

Personalization of today’s smart speakers is still in its infancy. You can change voice assistants’ accent and presenting gender, add or remove skills, and feed it bits of information like your name and place of work. In some cases, you can set up multiple voice profiles so that Google Home will recognize the individual members of your household.

Amazon Echo Show
Image used with permission by copyright holder

But there’s still a long way to go — although the juice should be worth the squeeze. Mattersight Corporation has developed A.I. call center technology, called Predictive Behavioral Routing, which analyzes the speech patterns of callers and matches them up with human operatives with compatible personality types. According to the company, matching a person with a compatible personality will result in a successful call that lasts just half the time, next to that of a person with a conflicting personality type.

Using a similar approach could result in A.I. assistants which talk back to you the way you like to be addressed. That could be something as simple as matching the accent and voice volume of the person they’re speaking with. Or it could change the way it addresses ideas by perhaps using more emotive words for some users, compared to more dense detailed information it could use for others. Maybe some people want a voice assistant to chat to at length, while others simply want one to convey the necessary information in the most concise manner possible. A.I. assistants should be capable of both.

Technologies like Google Duplex show just how convincingly accurate A.I.-generated synthesized voices and conversations are getting. As A.I.s move into areas more complex than dishing up song requests and food timers, expect to see this technology to play a major role.

This could be aided by breakthroughs in the ability to identify users by voice. Hebner notes that Nuance’s technology can ID users from just a single solitary second of audio. “It used to take 10 seconds to understand who you are, to get an accurate signal,” he said. “The power of that is significant.” Being able to identify users by a small snippet of voice solves the password problem, and opens up the opportunity to use voice assistants for more delicate confidential information.

Getting proactive

A good assistant will do something when you ask them to. A great assistant won’t need asking. Right now, A.I. assistants are still at this first stage. Users can get the song they want or the reminder they need, but typically only when it’s been explicitly requested. As people get more comfortable with voice assistants, there’s a great opportunity for them to move beyond being purely reactive devices to proactive ones.

There are big questions about whether or not people want to hand certain jobs over to machines.

How would you feel about an A.I. assistant making decisions on your behalf? These could be anything from cranking up the thermostat when someone says they’re cold or rebooking a lunch meeting because you’re running late, to nudging you to do more exercise or get better at saving your paycheck. As more and more smart devices make their way into the home, the number of things a voice assistant could conceivably command will greatly increase.

Part of this is a social question about how comfortable people are about machines making decisions on their part. There are big questions about whether or not people want to hand certain jobs over to machines. Think of it like giving your credit card and house keys to your flesh-and-blood assistant — only with a much bigger sprinkling of Skynet. The downside is giving up a certain amount of control. The potential upside is increasing your free time. Of course, there is a big technical challenge…

It’s all about the feedback

Tom Hebner pointed out a big challenge with the issue of proactivity: how do our machines know when they’ve got it right? Returning to the idea of the good vs. great assistant, a great assistant might have all your files out ahead of a big meeting, without you needing to ask. But what if they’re the wrong files? A big issue with making home A.I. assistants more proactive is that there are currently limited ways of revealing whether or not we’re getting the information is the right information.

A.I. is good pepper the robot
Tomohiro Ohsumi/Getty Images

“If I ask for the same song every day when I walk into my house, and then day I walk in and it just starts playing, how do they know that they got it right?” Hebner said. “If I don’t stop it playing, does that mean it’s right? If I do say ‘stop,’ does that mean it got it wrong and it should never do it again? The feedback mechanism is one of the reasons you’re not getting more proactive systems.”

This is a challenging one for engineers to figure out. Anyone who’s ever had an intern asking them for instruction and feedback on every single task knows that sometimes it’s easier to do a job yourself than delegate it. An A.I. assistant is there to make your life more frictionless; not to give you dozens of mini surveys each day to confirm if it’s done its job right. This will need to be solved in a way that’s not crippling to the user friendliness of these devices, and doesn’t require a whole lot of training up front before systems learn your preferences.

What’s the answer? I’m not sure. But, as Steve Jobs once said, it’s not the job of the customer to figure it out.

New interaction methods

There’s a scene in 2001: A Space Odyssey in which the murderous HAL 9000, disconcertingly still the most famous fictional A.I. assistant in history, reveals that it doesn’t just use microphones to determine what is being said to it. When two crew members try and choose a location to speak where they know HAL can’t hear, HAL reveals that he can still understand them, based on reading their lip movement.

2001: A Space Odyssey Image used with permission by copyright holder

Scary moment of the movie? Sure. An example of how A.I. assistants could work in the future? Um, sure!

The idea that voice assistants should be limited to voice diminishes the possible number of ways they could usefully interact with us. With the rise of facial recognition and emotion-tracking technologies, an ever-growing number of biometrics gathered about users on a constant basis, and even the possibility of mind-reading tech on the horizon, there are plenty of different signals which could be used by A.I. assistants to draw their conclusions.

The idea that, 10 years from now, we’ll only be using voice to control these A.I. assistants is like looking at PCs in the early 80s and thinking we’ll never have more than a keyboard at our disposal.

Luke Dormehl
Former Digital Trends Contributor
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…
Range Rover’s first electric SUV has 48,000 pre-orders
Land Rover Range Rover Velar SVAutobiography Dynamic Edition

Range Rover, the brand made famous for its British-styled, luxury, all-terrain SUVs, is keen to show it means business about going electric.

And, according to the most recent investor presentation by parent company JLR, that’s all because Range Rover fans are showing the way. Not only was demand for Range Rover’s hybrid vehicles up 29% in the last six months, but customers are buying hybrids “as a stepping stone towards battery electric vehicles,” the company says.

Read more
BYD’s cheap EVs might remain out of Canada too
BYD Han

With Chinese-made electric vehicles facing stiff tariffs in both Europe and America, a stirring question for EV drivers has started to arise: Can the race to make EVs more affordable continue if the world leader is kept out of the race?

China’s BYD, recognized as a global leader in terms of affordability, had to backtrack on plans to reach the U.S. market after the Biden administration in May imposed 100% tariffs on EVs made in China.

Read more
Tesla posts exaggerate self-driving capacity, safety regulators say
Beta of Tesla's FSD in a car.

The National Highway Traffic Safety Administration (NHTSA) is concerned that Tesla’s use of social media and its website makes false promises about the automaker’s full-self driving (FSD) software.
The warning dates back from May, but was made public in an email to Tesla released on November 8.
The NHTSA opened an investigation in October into 2.4 million Tesla vehicles equipped with the FSD software, following three reported collisions and a fatal crash. The investigation centers on FSD’s ability to perform in “relatively common” reduced visibility conditions, such as sun glare, fog, and airborne dust.
In these instances, it appears that “the driver may not be aware that he or she is responsible” to make appropriate operational selections, or “fully understand” the nuances of the system, NHTSA said.
Meanwhile, “Tesla’s X (Twitter) account has reposted or endorsed postings that exhibit disengaged driver behavior,” Gregory Magno, the NHTSA’s vehicle defects chief investigator, wrote to Tesla in an email.
The postings, which included reposted YouTube videos, may encourage viewers to see FSD-supervised as a “Robotaxi” instead of a partially automated, driver-assist system that requires “persistent attention and intermittent intervention by the driver,” Magno said.
In one of a number of Tesla posts on X, the social media platform owned by Tesla CEO Elon Musk, a driver was seen using FSD to reach a hospital while undergoing a heart attack. In another post, a driver said he had used FSD for a 50-minute ride home. Meanwhile, third-party comments on the posts promoted the advantages of using FSD while under the influence of alcohol or when tired, NHTSA said.
Tesla’s official website also promotes conflicting messaging on the capabilities of the FSD software, the regulator said.
NHTSA has requested that Tesla revisit its communications to ensure its messaging remains consistent with FSD’s approved instructions, namely that the software provides only a driver assist/support system requiring drivers to remain vigilant and maintain constant readiness to intervene in driving.
Tesla last month unveiled the Cybercab, an autonomous-driving EV with no steering wheel or pedals. The vehicle has been promoted as a robotaxi, a self-driving vehicle operated as part of a ride-paying service, such as the one already offered by Alphabet-owned Waymo.
But Tesla’s self-driving technology has remained under the scrutiny of regulators. FSD relies on multiple onboard cameras to feed machine-learning models that, in turn, help the car make decisions based on what it sees.
Meanwhile, Waymo’s technology relies on premapped roads, sensors, cameras, radar, and lidar (a laser-light radar), which might be very costly, but has met the approval of safety regulators.

Read more