Capitalization cues improve dependency grammar induction

WILS '12: Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure (2012)

Abstract
We show that orthographic cues can be helpful for unsupervised parsing. In the Penn Treebank, transitions between upper- and lower-case tokens tend to align with the boundaries of base (English) noun phrases. Such signals can be used as partial bracketing constraints to train a grammar inducer: in our experiments, directed dependency accuracy increased by 2.2% (average over 14 languages having case information). Combining capitalization with punctuation-induced constraints in inference further improved parsing performance, attaining state-of-the-art levels for many languages.
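As a rough illustration of how case transitions might be turned into bracketing constraints, the sketch below finds maximal runs of capitalized tokens and checks whether a dependency tree treats such a run as a single unit. The abstract does not spell out the exact procedure, so everything here is an assumption made for illustration: the function names (`capitalized_spans`, `multi_token_spans`, `respects_span`), the "first character is upper-case" test, and the rule that exactly one token in a span may attach outside it.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token span: [start, end)


def is_capitalized(token: str) -> bool:
    """True if the token begins with an upper-case letter (assumed cue)."""
    return token[:1].isupper()


def capitalized_spans(tokens: List[str]) -> List[Span]:
    """Return maximal runs of consecutive capitalized tokens.

    Span edges fall exactly where the text switches between
    capitalized and non-capitalized tokens.
    """
    spans: List[Span] = []
    start = None
    for i, tok in enumerate(tokens):
        if is_capitalized(tok):
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


def multi_token_spans(tokens: List[str]) -> List[Span]:
    """Keep only runs longer than one token; single tokens constrain nothing."""
    return [(s, e) for s, e in capitalized_spans(tokens) if e - s > 1]


def respects_span(heads: List[int], span: Span) -> bool:
    """Check one hypothetical reading of a bracketing constraint.

    heads[i] is the index of token i's head, or -1 for the root.
    The span is respected if exactly one of its tokens has a head
    outside the span, i.e. the span hangs together as one subtree fragment.
    """
    start, end = span
    external = sum(1 for i in range(start, end) if not (start <= heads[i] < end))
    return external == 1


tokens = "the New York Stock Exchange fell on Monday".split()
constraints = multi_token_spans(tokens)

# A candidate parse: every word of the name attaches to "Exchange",
# which in turn attaches to the verb "fell".
heads = [4, 4, 4, 4, 5, -1, 5, 6]
print(constraints, [respects_span(heads, s) for s in constraints])
```

A training procedure could use a check like `respects_span` to discard or penalize candidate parses that break a capitalized run apart. Sentence-initial words are capitalized regardless of their role, so a practical version would likely need to treat position 0 specially; that handling is left out of this sketch.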
Keywords
improved parsing performance, unsupervised parsing, Combining capitalization, Penn Treebank, case information, dependency accuracy, grammar inducer, lower-case token, noun phrase, orthographic cue, capitalization cue, dependency grammar induction