Looking at words and points with attention: a benchmark for text-to -shape coherence

Andrea Amaduzzi,Giuseppe Lisanti, Samuele Said,Luigi Di Stefano

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW（2023）

引用 0|浏览16

暂无评分

摘要

While text-conditional 3D object generation and manipulation have seen rapid progress, the evaluation of coherence between generated 3D shapes and input textual descriptions lacks a clear benchmark. The reason is twofold: a) the low quality of the textual descriptions in the only publicly available dataset of text-shape pairs; b) the limited effectiveness of the metrics used to quantitatively assess such coherence. In this paper, we propose a comprehensive solution that addresses both weaknesses. Firstly, we employ large language models to automatically refine textual descriptions associated with shapes. Secondly, we propose a quantitative metric to assess text-to -shape coherence, through cross-attention mechanisms. To validate our approach, we conduct a user study and compare quantitatively our metric with existing ones. The refined dataset, the new metric and a set of text-shape pairs validated by the user study comprise a novel, fine-grained benchmark that we publicly release to foster research on text-to -shape co-herence of text-conditioned 3D generative models.

查看译文

关键词

attention,benchmark,words,text-to-shape

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要