Enhance Image Classification Via Inter-Class Image Mixup With Diffusion Model
CVPR 2024 (2024)
Abstract
Text-to-image (T2I) generative models have recently emerged as a powerful
tool, enabling the creation of photo-realistic images and giving rise to a
multitude of applications. However, the effective integration of T2I models
into fundamental image classification tasks remains an open question. A
prevalent strategy to bolster image classification performance is through
augmenting the training set with synthetic images generated by T2I models. In
this study, we scrutinize the shortcomings of both current generative and
conventional data augmentation techniques. Our analysis reveals that these
methods struggle to produce images that are both faithful (in terms of
foreground objects) and diverse (in terms of background contexts) for
domain-specific concepts. To tackle this challenge, we introduce an innovative
inter-class data augmentation method known as Diff-Mix
(https://github.com/Zhicaiwww/Diff-Mix), which enriches the dataset by
performing image translations between classes. Our empirical results
demonstrate that Diff-Mix achieves a better balance between faithfulness and
diversity, leading to a marked improvement in performance across diverse image
classification scenarios, including few-shot, conventional, and long-tail
classifications for domain-specific datasets.
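To make the idea of inter-class augmentation concrete, below is a minimal sketch of how such image translation between classes could be wired with an off-the-shelf image-to-image diffusion pipeline. It assumes the Hugging Face `diffusers` library; the prompt template, the chosen checkpoint, and the strength-proportional soft labels are illustrative assumptions (analogous to Mixup-style label interpolation), not the exact Diff-Mix recipe, which is available at the repository linked above.

```python
# Sketch: inter-class image-translation augmentation with a diffusion model.
# Assumptions: `diffusers` Img2Img pipeline, a generic prompt template, and
# soft labels interpolated by edit strength (hypothetical, Mixup-style).
import random

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")


def inter_class_translate(source_image: Image.Image, target_class: str,
                          strength: float = 0.7) -> Image.Image:
    """Translate a source-class image toward a target class.

    Higher `strength` edits more of the image (more diversity, since the
    source background is preserved less); lower values stay more faithful
    to the original foreground.
    """
    prompt = f"a photo of a {target_class}"  # hypothetical prompt template
    out = pipe(prompt=prompt, image=source_image, strength=strength,
               guidance_scale=7.5, num_inference_steps=50)
    return out.images[0]


def augment(dataset, class_names, num_classes, strength=0.7):
    """Yield (edited_image, soft_label) pairs.

    Labels are split between the source and target class in proportion to
    the edit strength -- an assumption made for this sketch, mirroring how
    Mixup interpolates labels between two classes.
    """
    for image, src_idx in dataset:
        tgt_idx = random.choice(
            [c for c in range(num_classes) if c != src_idx]
        )
        edited = inter_class_translate(image, class_names[tgt_idx], strength)
        soft_label = torch.zeros(num_classes)
        soft_label[src_idx] = 1.0 - strength
        soft_label[tgt_idx] = strength
        yield edited, soft_label
```

The synthetic pairs produced this way would simply be appended to the real training set; the trade-off the abstract describes (faithful foregrounds versus diverse backgrounds) is controlled here by the single `strength` parameter.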