Using MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles.

Journal of Biomedical Informatics (2007)

Abstract
Abbreviations and acronyms are widely used in the biomedical literature. Because many of them represent important content, information retrieval and extraction benefit from identifying their meanings. However, many abbreviations and acronyms are ambiguous, so it is important to map them to their full forms, which ultimately represent their meanings. In this study, we present a semi-supervised method that uses MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles. We first automatically generated a dictionary of abbreviation-full form pairs from the MEDLINE abstracts, using a rule-based system that maps abbreviations to full forms when the full forms are defined in the abstracts. We then trained supervised machine-learning algorithms on the MEDLINE abstracts and applied them in a semi-supervised fashion to predict the full forms of abbreviations in full-text journal articles. We report up to 92% prediction precision and up to 91% coverage.
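To make the dictionary-building step concrete, here is a minimal sketch of how abbreviation-full form pairs might be harvested from abstracts. It is not the authors' rule-based system, whose rules are not given in the abstract. It assumes a single hypothetical rule: a definition has the shape "full form (ABBR)", and the initials of the words just before the parenthesis spell the abbreviation. The names `extract_pairs` and `build_dictionary` and the example sentences are illustrative only.

```python
import re
from collections import defaultdict

# Assumed pattern (not from the paper): a parenthesised token that starts
# with an uppercase letter and is 2 to 10 alphanumeric characters long.
ABBR_IN_PARENS = re.compile(r"\(([A-Z][A-Za-z0-9]{1,9})\)")


def extract_pairs(text):
    """Return (abbreviation, full form) pairs written as 'full form (ABBR)'."""
    pairs = []
    for match in ABBR_IN_PARENS.finditer(text):
        abbr = match.group(1)
        # Words that appear before the opening parenthesis.
        words = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        # Candidate full form: as many preceding words as the
        # abbreviation has characters.
        candidate = words[-len(abbr):]
        if len(candidate) < len(abbr):
            continue
        initials = "".join(word[0] for word in candidate)
        # Accept the pair only if the initials spell the abbreviation.
        if initials.lower() == abbr.lower():
            pairs.append((abbr, " ".join(candidate)))
    return pairs


def build_dictionary(abstracts):
    """Map each abbreviation to the set of full forms seen across abstracts."""
    dictionary = defaultdict(set)
    for abstract in abstracts:
        for abbr, full_form in extract_pairs(abstract):
            dictionary[abbr].add(full_form.lower())
    return dictionary


if __name__ == "__main__":
    # Made-up sentences for illustration; not taken from MEDLINE.
    abstracts = [
        "Mutations in adenomatous polyposis coli (APC) were found.",
        "Each antigen presenting cell (APC) was counted.",
    ]
    for abbr, full_forms in build_dictionary(abstracts).items():
        print(abbr, sorted(full_forms))
```

An abbreviation that ends up with more than one full form in such a dictionary is ambiguous, and those are the cases that the trained classifiers described in the abstract would then have to resolve from context.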
Keywords
word-sense disambiguation, machine learning, MEDLINE, full-text