Coarse Is Better? A New Pipeline Towards Self-Supervised Learning with Uncurated Images

CoRR (2023)

Abstract
Most self-supervised learning (SSL) methods work on curated datasets where the object-centric assumption holds. This assumption breaks down in uncurated scene images. Existing scene-image SSL methods try to find two views from the original scene images that are well matched or dense, which is both complex and computationally heavy. This paper proposes a conceptually different pipeline: first find regions that are coarse objects (with adequate objectness), then crop them out as pseudo object-centric images, after which any SSL method can be applied directly, as on a real object-centric dataset. That is, cropping benefits SSL on scene images. A novel cropping strategy is proposed to find coarse object regions in scene images. The proposed pipeline and cropping strategy successfully learn high-quality visual representations from uncurated scene datasets like MS-COCO, and the learning does not rely on external object-centric datasets such as ImageNet. Experiments show that our pipeline outperforms existing SSL methods on scene images and is friendly to non-contrastive SSL methods like MAE. Ablations further verify that the proposed cropping strategy does not rely on pretrained SSL models either.
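The crop-then-pretrain idea in the abstract can be illustrated with a minimal sketch. This is not the paper's cropping strategy, which the abstract does not specify. It assumes a per-pixel objectness map is already available from some source, and greedily picks fixed-size windows with the highest total objectness. The function name `coarse_object_crops` and its parameters `crop_size` and `num_crops` are hypothetical.

```python
import numpy as np


def coarse_object_crops(image, objectness, crop_size=(64, 64), num_crops=3):
    """Greedily pick non-overlapping windows with the highest total objectness.

    image:      H x W x C array holding the scene image.
    objectness: H x W array of per-pixel scores. ASSUMPTION: the abstract
                does not say how objectness is computed; any score map works.
    Returns (boxes, crops), where each box is (y0, x0, y1, x1).
    """
    ch, cw = crop_size
    # Integral image with a leading row/column of zeros, so that every
    # window sum can be read off with four lookups.
    integral = np.pad(objectness.astype(float), ((1, 0), (1, 0)))
    integral = integral.cumsum(axis=0).cumsum(axis=1)
    # scores[y, x] is the sum of objectness[y:y+ch, x:x+cw].
    scores = (integral[ch:, cw:] - integral[:-ch, cw:]
              - integral[ch:, :-cw] + integral[:-ch, :-cw])

    boxes = []
    for _ in range(num_crops):
        y, x = np.unravel_index(np.argmax(scores), scores.shape)
        if not np.isfinite(scores[y, x]):
            break  # every remaining window overlaps an earlier pick
        y, x = int(y), int(x)
        boxes.append((y, x, y + ch, x + cw))
        # Suppress all windows that overlap the chosen one, so later
        # crops cover different regions of the scene.
        scores[max(0, y - ch + 1):y + ch, max(0, x - cw + 1):x + cw] = -np.inf

    crops = [image[y0:y1, x0:x1] for (y0, x0, y1, x1) in boxes]
    return boxes, crops
```

Under this sketch, the returned crops would be saved as a new pseudo object-centric dataset, and an off-the-shelf SSL method would then be trained on them unchanged.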
Keywords
learning,images,new pipeline,self-supervised