Privacy concerns have swirled around smart home technology since the introduction of the first Amazon Echo, but Google and Amazon have so far managed to assuage consumers’ fears that their private lives were out in the open for people to see. That might have changed today with a report from the Belgian public broadcasting network VRT NWS, which revealed that third-party contractors are used to transcribe recordings of Google Assistant commands, some of which contain highly personal information.
Google and Amazon have both been upfront that commands given to their smart assistants are recorded and analyzed to help improve the natural speech of the devices. To their credit, most of the recordings the companies hold are mundane:
- “Hey Google, what’s the weather?”
- “Hey Google, how’s my commute?”
- “Hey Google, play me some smooth jazz.”
On the other hand, how many of us have received a response from Google when we asked no questions? These accidental activations are the crux of the problem.
Google operates in dozens of countries across the globe, many with their own languages. For Google Assistant to understand and respond naturally in each of these languages, the software uses machine learning algorithms to improve its responses. According to VRT NWS, however, the system also relies on easily parsed transcriptions of recorded commands. Google contracts this transcription work to outside companies that log into Google’s online tool Crowdsource.
VRT NWS claims it listened to more than 1,000 recordings. Of those, 153 “were conversations that should never have been recorded” and in which “the command ‘Okay Google’ was clearly not given.” That’s roughly 15% of the recordings the broadcaster reviewed.
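The 15% figure can be checked with quick arithmetic. The sketch below is only an illustration: the function name `accidental_share` is hypothetical, and because VRT NWS says it heard "more than" 1,000 recordings, using exactly 1,000 as the total is an assumption that gives an upper bound on the share.

```python
def accidental_share(accidental: int, total: int) -> float:
    """Return the percentage of recordings that were accidental activations."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * accidental / total


# 153 accidental recordings, as reported by VRT NWS.
# 1,000 is an assumed total: the report only says "more than 1,000",
# so the true share can be no higher than this result.
share = accidental_share(153, 1000)
print(f"{share:.1f}%")
```

A larger true total would push the share slightly lower, which is why "roughly 15%" is best read as a ceiling.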
The issue boils down to the personal information in the recordings, not the recordings themselves. Analyzing vocal commands and training the software to recognize spoken words more accurately is a natural part of improving voice assistants, but Google (and any other company that offers a voice assistant) should have strict guidelines in place for handling and deleting personal information.
In time, even these guidelines should be unnecessary. The software that listens for the activation phrase has to be refined to the point where it no longer records conversations in which the activation phrase was never spoken.