Instilling Type Knowledge in Language Models via Multi-Task QA.

Shuyang Li, Mukund Sridhar, Chandana Satya Prakash, Jin Cao, Wael Hamza, Julian McAuley

The Annual Conference of the North American Chapter of the Association for Computational Linguistics (2022)

Abstract

Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge -- their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance in zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges.

Keywords

type knowledge, language models, multi-task
