Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a Large Language Model Based on Group-Level Demographic Information
CoRR (2024)
Abstract
Large language models exhibit societal biases associated with demographic
information, including race, gender, and others. Endowing such language models
with personalities based on demographic data can enable generating opinions
that align with those of humans. Building on this idea, we propose "random
silicon sampling," a method for emulating the opinions of human population
subgroups. Our study analyzed 1) whether a language model can generate survey
responses that match those of a human group based solely on the group's demographic
distribution and 2) the applicability of our methodology across various
demographic subgroups and question topics. Through random silicon sampling
and using only group-level demographic information, we discovered that language
models can generate response distributions that are remarkably similar to those
of actual U.S. public opinion polls. Moreover, we found that how closely
language models replicate human responses varies with the demographic group and
the topic of the question, which can be attributed to inherent societal biases in the models.
Our findings demonstrate the feasibility of mirroring a group's opinion using
only demographic distribution and elucidate the effect of social biases in
language models on such simulations.