Controlled Training Data Generation with Diffusion Models
CoRR (2024)
Abstract
In this work, we present a method to control a text-to-image generative model
to produce training data specifically "useful" for supervised learning. Unlike
previous works that employ an open-loop approach and pre-define prompts to
generate new data using either a language model or human expertise, we develop
an automated closed-loop system which involves two feedback mechanisms. The
first mechanism uses feedback from a given supervised model and finds
adversarial prompts that result in image generations that maximize the model
loss. While these adversarial prompts result in diverse data informed by the
model, they are not informed of the target distribution, which can be
inefficient. Therefore, we introduce the second feedback mechanism that guides
the generation process towards a certain target distribution. We call the
method combining these two mechanisms Guided Adversarial Prompts. We perform
our evaluations on different tasks, datasets and architectures, with different
types of distribution shifts (spuriously correlated data, unseen domains) and
demonstrate the efficiency of the proposed feedback mechanisms compared to
open-loop approaches.
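The closed loop described in the abstract combines two signals when selecting prompts: an adversarial term that maximizes the supervised model's loss, and a guidance term that keeps generations near a target distribution. The following is a minimal toy sketch of that scoring idea; every name here (`toy_generate`, `prompt_score`, the candidate pool) is an illustrative stand-in, not the authors' implementation, which optimizes prompts through an actual text-to-image diffusion model and a trained supervised model.

```python
# Toy sketch of Guided Adversarial Prompts: score = adversarial feedback
# (model loss to maximize) minus a penalty for drifting from the target
# distribution. All functions are hypothetical stand-ins.

def toy_generate(prompt: str) -> float:
    """Stand-in for the text-to-image model: maps a prompt to a scalar
    'image feature' deterministically (for illustration only)."""
    return (sum(ord(c) for c in prompt) % 100) / 100.0

def model_loss(feature: float) -> float:
    """Stand-in for the supervised model's loss on a generated image."""
    return (feature - 0.5) ** 2

def prompt_score(prompt: str, target: float = 0.8, lam: float = 0.5) -> float:
    """First feedback mechanism: reward high model loss (adversarial).
    Second feedback mechanism: penalize distance from the target
    distribution (guidance), weighted by lam."""
    f = toy_generate(prompt)
    return model_loss(f) - lam * abs(f - target)

def guided_adversarial_prompt(candidates, target: float = 0.8,
                              lam: float = 0.5) -> str:
    """Pick the candidate prompt with the best guided-adversarial score."""
    return max(candidates, key=lambda p: prompt_score(p, target, lam))
```

Note that the real method searches the prompt space with feedback from the generative model itself rather than ranking a fixed candidate list; the discrete argmax above is only meant to make the two-term objective concrete.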