Maximum Entropy Word Segmentation of Chinese Text

SIGHAN@COLING/ACL(2006)

Abstract
We extended the work of Low, Ng, and Guo (2005) to create a Chinese word segmentation system based upon a maximum entropy statistical model. This system was entered into the Third International Chinese Language Processing Bakeoff and evaluated on all four corpora in their respective open tracks. Our system achieved the highest F-score for the UPUC corpus, and the second, third, and seventh highest for CKIP, CITYU, and MSRA respectively. Later testing with the gold-standard data revealed that while the additions we made to Low et al.'s system helped our results for the 2005 data with which we experimented during development, a number of them actually hurt our scores for this year's corpora.
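The approach the abstract describes treats segmentation as character tagging: a maximum entropy classifier labels each character as beginning, middle, or end of a word, or as a single-character word. The toy corpus, feature templates, and plain gradient-ascent training below are illustrative assumptions for a minimal sketch, not the authors' actual system or feature set.

```python
# Sketch of character-based segmentation with a maximum-entropy
# (multinomial logistic regression) tagger, in the spirit of Low, Ng & Guo (2005).
# Corpus, templates, and hyperparameters here are illustrative assumptions.
import math
from collections import defaultdict

TAGS = ["B", "M", "E", "S"]  # begin / middle / end of word, single-char word

def features(chars, i):
    # Small subset of typical context-window templates: C-1, C0, C+1, C-1C0.
    prev_c = chars[i - 1] if i > 0 else "<s>"
    next_c = chars[i + 1] if i < len(chars) - 1 else "</s>"
    return [f"C0={chars[i]}", f"C-1={prev_c}", f"C+1={next_c}",
            f"C-1C0={prev_c}{chars[i]}"]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def tag_probs(weights, feats):
    return softmax([sum(weights[(f, t)] for f in feats) for t in TAGS])

def train(sentences, epochs=50, lr=0.5):
    # sentences: lists of gold-segmented words; derive BMES tags per character,
    # then run gradient ascent on the conditional log-likelihood.
    weights = defaultdict(float)
    data = []
    for words in sentences:
        chars = list("".join(words))
        tags = []
        for w in words:
            tags += ["S"] if len(w) == 1 else ["B"] + ["M"] * (len(w) - 2) + ["E"]
        for i, gold in enumerate(tags):
            data.append((features(chars, i), gold))
    for _ in range(epochs):
        for feats, gold in data:
            probs = tag_probs(weights, feats)
            for t, p in zip(TAGS, probs):
                grad = (1.0 if t == gold else 0.0) - p
                for f in feats:
                    weights[(f, t)] += lr * grad
    return weights

def segment(weights, text):
    # Greedy per-character decoding; a real system would use Viterbi search.
    chars = list(text)
    words, cur = [], ""
    for i in range(len(chars)):
        probs = tag_probs(weights, features(chars, i))
        tag = TAGS[max(range(len(TAGS)), key=lambda k: probs[k])]
        cur += chars[i]
        if tag in ("E", "S"):
            words.append(cur)
            cur = ""
    if cur:
        words.append(cur)
    return words

corpus = [["我们", "喜欢", "中文"], ["我们", "学习", "中文"]]
model = train(corpus)
print(segment(model, "我们喜欢中文"))
```

The greedy decoder here ignores tag-sequence consistency (e.g. "M" may not follow "S"); published systems typically decode with beam or Viterbi search over the tag lattice.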
Keywords
gold standard, maximum entropy, statistical model, word segmentation