Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit's Showerthoughts
arXiv (2024)
Abstract
Recent Large Language Models (LLMs) have shown the ability to generate
content that is difficult or impossible to distinguish from human writing. We
investigate the ability of differently-sized LLMs to replicate human writing
style in short, creative texts in the domain of Showerthoughts, thoughts that
may occur during mundane activities. We compare GPT-2 and GPT-Neo, fine-tuned on
Reddit data, and GPT-3.5, invoked in a zero-shot manner, against
human-authored texts. We measure human preference for the texts across the
specific dimensions that account for the quality of creative, witty texts.
Additionally, we compare the ability of humans versus fine-tuned RoBERTa
classifiers to detect AI-generated texts. We conclude that human evaluators
rate the generated texts slightly worse on average regarding their creative
quality, but they are unable to reliably distinguish between human-written and
AI-generated texts. We further provide a dataset for creative, witty text
generation based on Reddit Showerthoughts posts.