ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition
CoRR (2024)
Abstract
In this work, we revisit the problem of semi-supervised named entity
recognition (NER) focusing on extremely light supervision, consisting of a
lexicon containing only 10 examples per class. We introduce ELLEN, a simple,
fully modular, neuro-symbolic method that blends fine-tuned language models
with linguistic rules. These rules include insights such as "One Sense Per
Discourse", using a Masked Language Model as an unsupervised NER, leveraging
part-of-speech tags to identify and eliminate unlabeled entities as false
negatives, and other intuitions about classifier confidence scores in local and
global context. ELLEN achieves very strong performance on the CoNLL-2003
dataset when using the minimal supervision from the lexicon above. It also
outperforms most existing (and considerably more complex) semi-supervised NER
methods under the same supervision settings commonly used in the literature
(i.e., 5% of labeled data). Further, when evaluated in a
zero-shot scenario on WNUT-17, it outperforms GPT-3.5 and
achieves comparable performance to GPT-4. In a zero-shot setting, ELLEN also
achieves over 75% of the performance of a strong, fully supervised model
trained on gold data. Our code is available at:
https://github.com/hriaz17/ELLEN.
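The "One Sense Per Discourse" rule mentioned in the abstract can be illustrated with a minimal sketch (hypothetical function and data layout, not the authors' implementation): once a surface form receives an entity label anywhere in a document, the same label is propagated to unlabeled occurrences of that form elsewhere in the same document.

```python
def propagate_one_sense_per_discourse(doc_sentences):
    """Propagate entity labels within one document (illustrative sketch).

    doc_sentences: list of sentences, each a list of (token, label) pairs,
    where label is an entity class such as "PER" or None if unlabeled.
    Assumes a surface form keeps one sense (entity class) per discourse.
    """
    # Record the first label observed for each surface form in the document.
    sense = {}
    for sent in doc_sentences:
        for token, label in sent:
            if label is not None and token not in sense:
                sense[token] = label
    # Re-label unlabeled occurrences of known surface forms.
    return [
        [(token, label if label is not None else sense.get(token))
         for token, label in sent]
        for sent in doc_sentences
    ]


doc = [
    [("Jordan", "PER"), ("spoke", None)],
    [("Jordan", None), ("visited", None)],
]
# The second, unlabeled "Jordan" inherits the PER label from the first.
propagated = propagate_one_sense_per_discourse(doc)
```

Tokens whose surface form never appears with a label (here, "visited") remain unlabeled, so the rule only spreads existing evidence rather than inventing new labels.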