Whether they are as serious as bomb threats or an apparent “joke” like a fake call to the Coast Guard, hoax calls cost law enforcement agencies billions of dollars every year.
Because of the potential seriousness of these calls, agencies are compelled to respond, which often means deploying personnel and wasting valuable resources. This also harms the general public, since it diverts resources from genuine emergencies that could otherwise have been attended to.
Up until now law enforcement agencies have had few ways of knowing whether a potential hoax call was legitimate or not, or of identifying an anonymous caller. That’s changing thanks to the work of researchers like Rita Singh, a speech scientist at Carnegie Mellon. Singh’s research focuses on using speech recognition algorithms to extract seemingly impossible amounts of detail from callers.
According to Singh (and the law enforcement agencies she works with) this data can include everything from a person’s gender and age to their height, weight, place of birth, ethnicity, level of intoxication, emotional state, and possible drug taking. Going further than that, other algorithms can reveal the possible facial structure of callers, and even precise details about their physical environment — such as the material of the walls and ceiling, any carpet, surrounding objects, and more.
“Not everything is present in all voice samples, but a smattering of [them] usually are, and we are able to find them accurately enough to send the law enforcement looking in the right directions,” Singh told Digital Trends.
While we hear a lot about biometrics like fingerprints, Singh explained that voice is perhaps the most important biomarker of all, since it can reveal so many details about a person.
“When people commit crimes through voice, they don’t realize this kind of technology exists,” she continued. “People will try to disguise their voice, particularly if they’re a repeat offender. What we’re working on is coming up with the right parameters so that we can tell law enforcement which aspects of people’s voice they are able to disguise and which aspects cannot be changed because they’re not under the speaker’s voluntary control. It’s these second parts that we focus on because it’s what gives us the most useful, accurate information.”
The work was originally developed by the Coast Guard Investigative Agency, and Singh explained that it is funded by the Department of Homeland Security. Not all of the applications are about criminal cases, however. For instance, earlier this year Singh applied her analysis of vocal micro-features to the question of whether Donald Trump had posed as public-relations man John Miller in a 1991 phone interview, something he publicly denied. (According to Singh, based on recordings of Trump at the time, it was indeed him.)
As impressive as this work is, however, Singh told Digital Trends that it is not yet at the point of being fully automated.
“When I get a voice print from a crime scene, it takes me more than a week to extract all of the information I can squeeze out of it,” she said. “There are plenty of algorithms that we have at our disposal, but you have to choose the right one. There isn’t a process in place that will choose the right ones for you. It’s going to take a few years still. But it’s already a very powerful tool.”