Update: Included Google’s response to Bankston’s thread at the bottom of the post.
A troubling discovery made late last week calls into question exactly what Google Gemini can and can’t see. Kevin Bankston, the senior adviser on AI governance at the Center for Democracy and Technology, found that Gemini had automatically summarized the private tax return he’d opened in Google Docs, and he posted about his findings on X.
Just pulled up my tax return in @Google Docs–and unbidden, Gemini summarized it. So…Gemini is automatically ingesting even the private docs I open in Google Docs? WTF, guys. I didn't ask for this. Now I have to go find new settings I was never told about to turn this crap off.
— Kevin Bankston (@KevinBankston) July 10, 2024
This is something that, in theory, the AI assistant very much shouldn’t be able to do without express authorization from the user. His search for the privacy setting that would disable this behavior only surfaced more concerning questions about what generative AI systems ingest, and how.
Bankston initially spent 15 minutes quizzing the AI itself for directions to the relevant settings menu, but to no avail. The system would only tell him how to manage his Gemini chat history. What’s more, neither of the settings suggestions the system did offer actually resolved Bankston’s issue, and when he finally found the option to disable summarization in Google Workspace, it sat in an entirely different menu than the one Gemini had described. Per the AI itself, those settings should be openly available to users. So, given that they aren’t, Bankston argues that the AI is either “hallucinating (lying)” or something within Google’s servers is not operating as it should.
While he was subsequently directed toward the Gemini Workspace privacy commitments page, he wondered, “what if I still don’t want it looking at my docs unprompted? I didn’t *ask* it to summarize my taxes, it just did. It should be up to me whether/which private docs prompt the model.” Bankston also notes that users need to pay for a $20/month AI Premium subscription to enjoy expanded commitments regarding how their personal data will be protected.
This isn’t the first time that Google’s AI products have suffered data leaks. In September 2023, Gemini’s precursor, Bard, accidentally revealed user chat sessions in public search results. Google has even warned its own employees against entering sensitive data into its chatbots to prevent unintentional leaks. The company was also sued last July over allegations that its scraping of the public internet for AI training data violated people’s privacy and property rights.
Eventually, Bankston was able to troubleshoot the problem and identify its root cause. “It seems that if you’ve ever clicked the Gemini button for a type of document then it remains open whenever you open another of that type–and therefore automatically ingests and summarizes it,” he wrote.
So, because he had used Gemini to summarize a different PDF earlier, the system appears to have granted itself access to all PDFs opened throughout the session. “Same with GDocs–it wasn’t on in any of my Docs,” he also noted, “then I turned it on in one, and now it auto-summarizes any I open.”
Regardless of the reasons behind the glitch, this sort of behavior from the AI system has significant privacy implications for users. As Bankston argues, “how many people have unwittingly inputted how many more private docs into Gemini simply because they clicked on that little AI star once in one document?”
While access to additional documents could help the model refine its responses and improve performance, ingesting them without transparency or the content owners’ permission will only further erode the public’s already slim trust in AI.
Google disputed multiple aspects of Bankston’s account, including whether data ingestion is happening at all. A Google spokesperson said that content from an open document can be used to generate a summary in real time, but only if the Gemini feature is enabled, and that neither the summary nor the document itself is saved in any way. The following is the official statement from Google:
“Our generative AI features are designed to give users choice and keep them in control of their data. Using Gemini in Google Workspace requires a user to proactively enable it, and when they do their content is used in a privacy-preserving manner to generate useful responses to their prompts, but is not otherwise stored without permission.”