Improving The Generation Of Infoboxes From Data Silos Through Machine Learning And The Use Of Semantic Repositories

Angel L. Garrido, Susana Sangiao, Oscar Cardiel

INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS (2017)

Abstract

Nowadays, both public and private organizations own large, private, text-based data repositories containing critical information. The information stored in these data silos is usually queried through index-based information retrieval systems, which yield hundreds or thousands of results when queried with keywords. To improve data accessibility when searching for specific information, infoboxes can be very useful. Generating such infoboxes is a complex problem in itself, but in this type of isolated environment it becomes even harder, as the selection of the entities and their attributes can be conditioned by local and very specific parameters. In this work, we propose a methodology to tackle this particular problem, combining classical approaches with machine learning and leveraging the resources provided by the Semantic Web. The methodology has been applied to two well-known datasets and has also been tested in a real-world scenario, showing the feasibility of our approach.

Keywords

Infoboxes, named entity disambiguation, information extraction, machine learning
