Agreement based source selection for the multi-topic deep web integration

Proceedings of the 17th International Conference on Management of Data (2014)

Abstract
One immediate challenge in searching deep web databases is source selection, i.e., selecting the most relevant web databases for answering a given query. For open collections like the deep web, source selection must be sensitive to the trustworthiness and importance of sources. Recent advances solve these problems for single-topic deep web search by adapting an agreement-based approach (cf. SourceRank [10]). In this paper we introduce a source selection method sensitive to trust and importance for multi-topic deep web search. We compute multiple quality scores for each source, tailored to different topics, based on topic-specific crawl data. At query time, we classify the query to determine its probability of membership in different topics. These fractional memberships are used as weights on the topic-specific quality scores of sources to select sources for the query. Extensive experiments on more than a thousand sources in multiple topics show 18-85% improvements in result quality over Google Product Search and other existing methods.
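The query-time weighting described in the abstract can be illustrated with a minimal sketch. It assumes the per-topic quality scores and the query's topic membership probabilities are already available; the function names, source names, topics, and numeric values are all hypothetical, and the classifier and the agreement-based score computation are not shown. This is an illustration of the weighted combination only, not the authors' implementation.

```python
from typing import Dict, List


def blended_source_scores(
    topic_probs: Dict[str, float],
    topic_scores: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Weight each source's topic-specific quality score by the query's
    probability of membership in that topic, then sum over topics."""
    blended = {}
    for source, per_topic in topic_scores.items():
        blended[source] = sum(
            prob * per_topic.get(topic, 0.0)  # missing topic counts as score 0
            for topic, prob in topic_probs.items()
        )
    return blended


def select_sources(
    topic_probs: Dict[str, float],
    topic_scores: Dict[str, Dict[str, float]],
    k: int,
) -> List[str]:
    """Return the k sources with the highest blended score (ties broken by name)."""
    blended = blended_source_scores(topic_probs, topic_scores)
    return sorted(blended, key=lambda s: (-blended[s], s))[:k]


# Hypothetical inputs: a query classified as mostly "books", partly "movies".
topic_probs = {"books": 0.75, "movies": 0.25}

# Hypothetical topic-specific quality scores, assumed precomputed per source.
topic_scores = {
    "source_a": {"books": 0.8, "movies": 0.0},
    "source_b": {"books": 0.4, "movies": 0.8},
    "source_c": {"books": 0.0, "movies": 0.4},
}

blended = blended_source_scores(topic_probs, topic_scores)
top_sources = select_sources(topic_probs, topic_scores, k=2)
print(top_sources)
```

With these made-up numbers, a source that is strong only in the dominant topic can still outrank a source that is moderately good in both, because the weights follow the query's fractional topic memberships.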
Keywords
deep web, deep web databases, multiple topic, multi-topic deep web integration, multi topic, query time, relevant web databases, different topic, source selection, single topic, deep web search