Supporting Human-AI Collaboration in Auditing LLMs with LLMs

PROCEEDINGS OF THE 2023 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, AIES 2023 (2023)

Cited by 11 | Views 117
Abstract
Large language models (LLMs) are becoming increasingly powerful and pervasive through their deployment in sociotechnical systems. Yet these models, whether used for classification or generation, have been shown to be biased and to behave irresponsibly, causing harm to people at scale. It is therefore crucial to audit these language models rigorously before deployment. Existing auditing tools use humans, AI, or both to find failures. In this work, we draw upon the literature on human-AI collaboration and sensemaking, and interview research experts in safe and fair AI, to build upon AdaTest [36], an auditing tool powered by a generative LLM. Through the design process we highlight the importance of sensemaking and human-AI communication in leveraging the complementary strengths of humans and generative models in collaborative auditing. To evaluate the effectiveness of the augmented tool, AdaTest++, we conduct user studies in which participants audit two commercial language models: OpenAI's GPT-3 and Azure's sentiment analysis model. Qualitative analysis shows that AdaTest++ effectively leverages human strengths such as schematization and hypothesis testing. Further, with our tool, users identified a variety of failure modes, covering 26 different topics across two tasks, including failures surfaced in formal audits as well as failures that were previously under-reported.
Keywords
language models, generative models, auditing, biases