EKGRL: Entity-Based Knowledge Graph Representation Learning for Fact-Based Visual Question Answering

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI (2024)

Abstract
Fact-based Visual Question Answering (FVQA) is the task of answering a question about a given image using external knowledge associated with it. Current FVQA models, including query-based and joint learning methods, have insufficient reasoning ability. To achieve stronger reasoning ability, we propose an entity-based knowledge graph representation learning (EKGRL) method. Our model achieves state-of-the-art performance on the FVQA dataset. Furthermore, we build a psychological fact-based VQA dataset (PFVQA) containing 6129 questions of six different types, which is, as far as we know, the first VQA dataset built on psychological knowledge. We demonstrate that EKGRL also achieves state-of-the-art performance on PFVQA, showing that it maintains good reasoning and knowledge representation performance with external knowledge from both the commonsense and psychological domains.
Keywords
FVQA, Representation Learning, Psychological Knowledge