A differentially private framework for gaining insights into AI chatbot use

December 10, 2025
Alexander Knop, Research Scientist, and Daogao Liu, Postdoctoral Researcher, Google Research
Introducing a novel framework that generates high-level insights into AI chatbot usage through a pipeline of DP clustering, DP keyword extraction, and LLM summarization. This approach provides rigorous, end-to-end DP guarantees, ensuring user conversation privacy while offering utility for platform improvement.
Large language model (LLM) chatbots are used by hundreds of millions of people daily for tasks ranging from drafting emails and writing code to planning vacations and creating menus for cafes. Understanding these high-level use cases is incredibly valuable for platform providers looking to improve services or enforce safety policies. It also offers the public insights into how AI is shaping our world.
But this raises a critical question: How can we gain valuable insights when the conversations themselves might contain private or sensitive information?
Existing approaches, like the CLIO framework, attempt to solve this by using an LLM to summarize conversations while prompting it to strip out personally identifiable information (PII). While a good first step, this method relies on heuristic privacy protections. The resulting privacy guarantee is difficult to formalize and may not hold up as models evolve, making these systems difficult to maintain and audit. This limitation led us to ask if it is possible to achieve similar utility with formal, end-to-end privacy guarantees.
In our paper, “Urania: Differentially Private Insights into AI Use,” presented at COLM 2025, we introduce a new framework that generates insights from LLM chatbot interactions with rigorous differential privacy (DP) guarantees. The framework uses a DP clustering algorithm and a DP keyword extraction method to ensure that no single conversation overly influences the result (i.e., the output summaries do not reveal information about any individual’s conversation). Below, we explain the algorithm and demonstrate that the framework provides stronger privacy guarantees than prior solutions.
Gemini-generated image showing schematically how the algorithm works for one cluster of conversations.
DP uses a privacy budget parameter, ε, to bound the influence that any single user’s contribution can have on the final output of a model. Our framework is designed to rely on two key properties of DP:
1. Post-processing: any computation applied to the output of a DP mechanism, such as asking an LLM to summarize privately released keywords, remains DP with the same budget.
2. Composition: when several DP steps run on the same data, their privacy budgets add up, so the pipeline’s stages combine into a single end-to-end guarantee.
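To make these properties concrete, here is a minimal, self-contained sketch (not code from the paper): the classic Laplace mechanism illustrates how ε controls the noise added to a count, and summing per-stage budgets illustrates basic sequential composition. The stage names and budget values are purely illustrative.

```python
import numpy as np

def laplace_mechanism(true_count: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with epsilon-DP by adding Laplace noise scaled to
    sensitivity / epsilon, where sensitivity bounds how much one user's
    data can change the count."""
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Basic sequential composition: running several DP steps on the same data
# consumes the sum of their budgets, giving one end-to-end guarantee.
stage_budgets = {"dp_clustering": 0.5, "dp_keyword_extraction": 0.5}  # illustrative values
total_epsilon = sum(stage_budgets.values())

print(f"noisy count: {laplace_mechanism(128, stage_budgets['dp_clustering']):.1f}")
print(f"end-to-end epsilon: {total_epsilon}")
```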
This differentially private pipeline is designed to ensure end-to-end user data protection through the following stages (a simplified sketch of the keyword-extraction stage appears after the list):
1. DP clustering: conversation embeddings are grouped so that the released clusters do not depend heavily on any single conversation.
2. DP keyword extraction: for each cluster, only keywords that are common across many conversations are released, with noise added to their counts.
3. LLM summarization: an LLM turns each cluster’s released keywords into a readable summary; by the post-processing property, these summaries inherit the DP guarantee.
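As an illustration of the keyword-extraction idea, the sketch below uses a simplified Laplace-plus-threshold rule: each conversation contributes each word at most once, noise is added to the counts, and only high-count words are released. This is a toy stand-in, not the paper’s actual mechanism; a rigorous private selection over an unbounded keyword domain requires a carefully calibrated threshold and an additional δ parameter.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def dp_top_keywords(conversations: list[str], epsilon: float, threshold: float) -> list[str]:
    """Toy DP keyword extraction for one cluster of conversations.

    Each conversation contributes each word at most once, so adding or
    removing one conversation changes any count by at most 1 (sensitivity 1).
    Laplace noise plus a threshold keeps only widely shared keywords.
    """
    counts = Counter(word for conv in conversations for word in set(conv.lower().split()))
    noisy = {word: count + rng.laplace(scale=1.0 / epsilon) for word, count in counts.items()}
    return [word for word, score in sorted(noisy.items(), key=lambda kv: -kv[1]) if score >= threshold]

cluster = ["plan a trip to rome", "plan a weekend trip", "trip itinerary for rome"]
print(dp_top_keywords(cluster, epsilon=1.0, threshold=2.0))
```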
The framework’s data flow. Yellow nodes denote non-DP data, green nodes represent operations that are either DP or per conversation, light blue nodes denote private data, and dark blue nodes represent non-private operations.
By integrating DP at its core, this framework’s privacy guarantees are mathematical, not heuristic. They don’t depend on an LLM’s ability to perfectly redact private data: even if individual keywords contain PII or other sensitive data, the DP extraction step ensures the generated summaries will not contain it. In practical terms, this guarantee makes it impossible for the LLM to reveal sensitive data from any single conversation (e.g., due to prompt injection attacks).
To evaluate our framework’s utility (summary quality) and privacy (protection strength), we compared its performance against Simple-CLIO, a non-private baseline we created inspired by CLIO. The baseline follows a two-step process (sketched below):
1. An LLM summarizes each conversation individually, prompted to strip out PII.
2. The summaries are clustered, and an LLM generates a description for each cluster.
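For contrast with the DP pipeline, here is a rough sketch of the baseline’s shape. `call_llm`, `embed`, and `kmeans` are hypothetical stand-ins for an LLM API, an embedding model, and any off-the-shelf clustering routine; nothing here is the paper’s actual implementation.

```python
# Hypothetical stand-ins: call_llm, embed, and kmeans are placeholders for an
# LLM API, an embedding model, and a standard clustering routine.
def simple_clio(conversations: list[str], k: int) -> list[str]:
    # Step 1: summarize each conversation, relying on the prompt to drop PII.
    # This protection is heuristic: it holds only as well as the model obeys.
    summaries = [call_llm(f"Summarize without any PII: {c}") for c in conversations]

    # Step 2: cluster the summaries and describe each cluster with the LLM.
    labels = kmeans([embed(s) for s in summaries], k)
    return [
        call_llm("Describe the common theme: " + "; ".join(
            s for s, label in zip(summaries, labels) if label == cluster_id))
        for cluster_id in range(k)
    ]
```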
As expected, we observed a trade-off: stronger privacy settings (lower values of the privacy parameter ε) led to a decrease in the granularity of the summaries. For instance, topic coverage dropped as the privacy budget tightened, because the DP clustering algorithm produced fewer and less precise clusters.
However, the results also held a surprise. In head-to-head comparisons, LLM evaluators often preferred the private summaries generated by our framework. In one evaluation, the DP-generated summaries were favored up to 70% of the time. This suggests that the constraints imposed by this DP pipeline — forcing summaries to be based on general, frequent keywords — can lead to outputs that are more concise and focused than those from an unconstrained, non-private approach.
To test the framework’s robustness, we ran a membership inference-style attack designed to identify whether a specific sensitive conversation was included in the dataset. The results were clear: the attack on the DP pipeline performed about as well as random guessing, achieving an area under the receiver operating characteristic (ROC) curve, or AUC, of 0.53. In contrast, the attack was more successful against the non-private pipeline, which had a higher AUC of 0.58, indicating greater information leakage. This experiment provides empirical evidence that our framework offers significantly stronger protection against privacy leakage.
The ROC curve for the DP pipeline shows performance close to random guessing (AUC = 0.53), demonstrating its robustness.
The ROC curve for the non-private pipeline is more vulnerable (AUC = 0.58).
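For readers who want to reproduce this kind of evaluation, the sketch below shows how a membership-inference attack is typically scored with scikit-learn. The attack scores here are synthetic, generated only to show what AUC near 0.5 looks like for a well-protected pipeline versus a mildly leaky one; they are not the paper’s data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Ground truth: which candidate conversations were actually in the dataset.
is_member = np.array([1] * 500 + [0] * 500)

# Synthetic attack scores: a small upward shift for true members mimics a
# pipeline that leaks a little; loc=0 everywhere would give AUC ~ 0.5.
scores = rng.normal(loc=0.1 * is_member, scale=1.0)

# AUC ~ 0.5 means the attacker does no better than random guessing.
print(f"attack AUC: {roc_auc_score(is_member, scores):.2f}")
```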
Our work is a first step toward building systems that can analyze large-scale text corpora with formal privacy guarantees. We’ve shown that it’s possible to balance the need for meaningful insights with stringent user privacy.
Looking forward, we see several exciting avenues for future research. These include adapting the framework for online settings where new conversations are constantly added, exploring alternative DP mechanisms to further improve the utility-privacy trade-off, and adding support for multi-modal conversations (i.e., conversations involving images, videos, and audio).
As AI becomes more integrated into our daily lives, developing privacy-preserving methods for understanding its use is not just a technical challenge — it’s a fundamental requirement for building trustworthy and responsible AI.
Thanks to all project contributors, whose efforts were pivotal to the success of this work. Special thanks to our colleagues: Yaniv Carmel, Edith Cohen, Rudrajit Das, Chris Dibak, Vadym Doroshenko, Alessandro Epasto, Prem Eruvbetine, Dem Gerolemou, Badih Ghazi, Miguel Guevara, Steve He, Peter Kairouz, Pritish Kamath, Nir Kerem, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Shlomi Pasternak, Mikhail Pravilov, Adam Sealfon, Yurii Sushko, Da Yu, Chiyuan Zhang.