NNOSE: Nearest Neighbor Occupational Skill Extraction
CoRR (2024)
Abstract
The labor market is changing rapidly, prompting increased interest in the
automatic extraction of occupational skills from text. With the advent of
English benchmark job description datasets, there is a need for systems that
handle their diversity well. We tackle the complexity of occupational skill
dataset tasks: combining and leveraging multiple datasets for skill
extraction, identifying rarely observed skills within a dataset, and overcoming
the scarcity of skills across datasets. In particular, we investigate the
retrieval-augmentation of language models, employing an external datastore for
retrieving similar skills in a dataset-unifying manner. Our proposed method,
Nearest Neighbor Occupational Skill Extraction (NNOSE), effectively leverages
multiple datasets by
retrieving neighboring skills from other datasets in the datastore. This
improves skill extraction without additional fine-tuning. Crucially, we
observe a performance gain in predicting infrequent patterns, with substantial
gains of up to 30% span-F1 in cross-dataset settings.
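
The retrieval step described above can be illustrated with a small sketch. The abstract does not give the retrieval or scoring formula, so the code below assumes a kNN-LM-style scheme: token embeddings from several datasets sit in one shared datastore, the k nearest entries vote on a tag with distance-based weights, and the vote is interpolated with the tagger's own distribution. The datastore contents, the BIO tag set, the names `knn_distribution`, `interpolate`, and `predict`, and the parameters `k`, `lam`, and `temperature` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical shared datastore: (token embedding, BIO tag, source dataset).
# Entries from different datasets live side by side (assumption for this sketch).
DATASTORE = [
    ((0.9, 0.1), "B", "dataset_a"),
    ((0.8, 0.2), "I", "dataset_b"),
    ((0.1, 0.9), "O", "dataset_a"),
    ((0.2, 0.8), "O", "dataset_b"),
]
TAGS = ("B", "I", "O")


def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def knn_distribution(query, k=2, temperature=1.0):
    """Turn the k nearest datastore entries into a distribution over tags."""
    neighbors = sorted(DATASTORE, key=lambda e: squared_l2(query, e[0]))[:k]
    # Closer neighbors get larger weights (softmax over negative distances).
    weights = [math.exp(-squared_l2(query, emb) / temperature)
               for emb, _, _ in neighbors]
    total = sum(weights)
    dist = {tag: 0.0 for tag in TAGS}
    for weight, (_, tag, _) in zip(weights, neighbors):
        dist[tag] += weight / total
    return dist


def interpolate(model_dist, knn_dist, lam=0.3):
    """Mix the tagger's distribution with the retrieved one; no fine-tuning."""
    return {tag: lam * knn_dist[tag] + (1 - lam) * model_dist[tag]
            for tag in TAGS}


def predict(query, model_dist, k=2, lam=0.3):
    """Return the highest-scoring tag and the mixed distribution."""
    mixed = interpolate(model_dist, knn_distribution(query, k), lam)
    return max(mixed, key=mixed.get), mixed


# The tagger alone prefers "O", but the nearest stored token is tagged "B".
tag, mixed = predict((0.9, 0.1), {"B": 0.3, "I": 0.2, "O": 0.5}, k=1)
print(tag, mixed)
```

In this toy setup the retrieved neighbor shifts the prediction toward a tag the model alone would have missed, which is the kind of effect the abstract attributes to retrieval for infrequent patterns. A realistic system would use encoder hidden states as keys and an approximate nearest-neighbor index rather than a sorted list.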