Membership Inference Attacks and Privacy in Topic Modeling
CoRR (2024)
Abstract
Recent research shows that large language models are susceptible to privacy
attacks that infer aspects of the training data. However, it is unclear whether
simpler generative models, such as topic models, share similar vulnerabilities. In
this work, we propose an attack against topic models that can confidently
identify members of the training data in Latent Dirichlet Allocation. Our
results suggest that the privacy risks associated with generative modeling are
not restricted to large neural models. Additionally, to mitigate these
vulnerabilities, we explore differentially private (DP) topic modeling. We
propose a framework for private topic modeling that incorporates DP vocabulary
selection as a pre-processing step, and show that it improves privacy while
having limited effects on practical utility.