Large Language Models Help Humans Verify Truthfulness – Except When They Are Convincingly Wrong
arXiv (2023)
Abstract
Large Language Models (LLMs) are increasingly used for accessing information
on the web. Their truthfulness and factuality are thus of great interest. To
help users make the right decisions about the information they get, LLMs should
not only provide information but also help users fact-check it. Our experiments
with 80 crowdworkers compare language models with search engines (information
retrieval systems) at facilitating fact-checking. We prompt LLMs to validate a
given claim and provide corresponding explanations. Users reading LLM
explanations are significantly more efficient than those using search engines
while achieving similar accuracy. However, they over-rely on the LLMs when the
explanation is wrong. To reduce over-reliance on LLMs, we ask LLMs to provide
contrastive information - explain both why the claim is true and false, and
then we present both sides of the explanation to users. This contrastive
explanation mitigates users' over-reliance on LLMs, but cannot significantly
outperform search engines. Further, showing both search engine results and LLM
explanations offers no complementary benefits compared to search engines alone.
Taken together, our study highlights that natural language explanations by LLMs
may not be a reliable replacement for reading the retrieved passages,
especially in high-stakes settings where over-relying on wrong AI explanations
could lead to critical consequences.