A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish
CoRR (2024)
Abstract
Large Language Models (LLMs) are trained on massive web-crawled corpora. This
poses risks of leakage, including personal information, copyrighted texts, and
benchmark datasets. Such leakage undermines human trust in AI because of the
potential for unauthorized generation of content or overestimation of
performance. We establish the following three criteria concerning leakage
issues: (1) leakage rate: the proportion of leaked data in the training data,
(2) output rate: the ease of generating leaked data, and (3) detection rate:
the detection performance on leaked versus non-leaked data. Although the
leakage rate is the origin of data leakage issues, it is not well understood
how it affects the output rate and detection rate. In this paper, we conduct an
experimental survey to elucidate the relationship between the leakage rate and
both the output rate and detection rate for personal information, copyrighted
texts, and benchmark data. Additionally, we propose a self-detection approach
that uses few-shot learning, in which LLMs detect whether instances are present
or absent in their training data, in contrast to previous methods that do not
employ explicit learning. To explore the ease of generating leaked information,
we create a dataset of prompts designed to elicit personal information,
copyrighted text, and benchmarks from LLMs. Our experiments reveal that LLMs
produce leaked information in most cases despite the small proportion of such
data in their training set; this indicates that even small amounts of leaked
data can greatly affect outputs. Our self-detection method also showed superior
performance compared to existing detection methods.
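The abstract does not spell out the prompt format of the few-shot self-detection approach, so the following is only a minimal illustrative sketch of the general idea: the model is shown labeled demonstrations of texts known to be present ("Yes") or absent ("No") from its training data and is then asked to judge a new candidate. The function names (`build_self_detection_prompt`, `self_detect`) and the `generate` callback are hypothetical placeholders, not the paper's implementation.

```python
# Sketch of few-shot "self-detection" membership classification.
# Assumptions: `generate(prompt)` is any LLM completion call supplied by the
# caller; demo labels are "Yes" (seen in training data) or "No" (not seen).

def build_self_detection_prompt(demos, candidate):
    """Build a few-shot prompt from (text, label) demonstration pairs."""
    lines = [
        "Decide whether the following text appeared in your training data.",
        "Answer with Yes or No.",
        "",
    ]
    for text, label in demos:
        lines.append(f"Text: {text}")
        lines.append(f"In training data: {label}")
        lines.append("")
    lines.append(f"Text: {candidate}")
    lines.append("In training data:")
    return "\n".join(lines)


def self_detect(generate, demos, candidate):
    """Return True if the model judges `candidate` as present in its training data."""
    prompt = build_self_detection_prompt(demos, candidate)
    answer = generate(prompt)  # placeholder for an actual LLM call
    return answer.strip().lower().startswith("yes")


if __name__ == "__main__":
    demos = [
        ("An excerpt known to be in the training corpus ...", "Yes"),
        ("A freshly written sentence the model cannot have seen ...", "No"),
    ]
    stub_generate = lambda prompt: "Yes"  # stub model for demonstration only
    print(self_detect(stub_generate, demos, "Some candidate passage ..."))
```

In this framing, the detection rate from the abstract's third criterion would be measured by running such a classifier over a balanced set of leaked and non-leaked instances and scoring its accuracy.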