A multi-layer graph analytics to identify bioinformatics tool usage practices from tool directories and PubMed indexed cross-citations

2016 International Computer Science and Engineering Conference (ICSEC)(2016)

Abstract
The essence of bioinformatics is to research, develop, or apply computational tools to biology-related data. We confirmed in this paper that the number of bioinformatics tools grew exponentially from 1988 to 2016. To aid tool discovery, many directories were established. Though some directories provide the number of citations to each tool, none describes how the tools were used. Currently, continuously reviewing the literature remains the key method for keeping up with rapidly changing best practices. To reduce this burden, we proposed a method that systematically gathers documented usage from the literature and analyzes it so that the active tool combinations can be derived. Our implementation of the method found a total of 4,832 bioinformatics tools, published during 1988-2016, with known PubMed unique identifiers. From January to July 2016, the tools were cited in 13,619 publications. From those publications, 57 function sets (i.e., analysis patterns) were deduced by clustering the usage instances according to the tool functionalities used. A total of 666 tool combinations were observed from those function sets. The top five function sets consisted of 30-98 combinations each; the additional 43 function sets contained 2-9 combinations. The nonhomogeneous tool preferences prompt a search for the factors that influence them, to guide the improvement of tool discovery methods.
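The grouping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering procedure, so exact grouping by identical function sets stands in for it here. The tool names, function labels, publication identifiers, and the helper `group_by_function_set` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): each tool maps to one function,
# and each publication maps to the tools it cites.
TOOL_FUNCTION = {
    "ToolA": "alignment",
    "ToolB": "alignment",
    "ToolC": "variant calling",
    "ToolD": "assembly",
}

CITATIONS = {
    "pub1": ["ToolA", "ToolC"],
    "pub2": ["ToolB", "ToolC"],
    "pub3": ["ToolA", "ToolC"],
    "pub4": ["ToolD"],
}


def group_by_function_set(citations, tool_function):
    """Group usage instances by the set of functions their cited tools cover.

    Returns {function_set: {tool_combination: publication_count}}.
    Assumption: publications with identical function sets belong to the
    same analysis pattern (a stand-in for the clustering in the abstract).
    """
    groups = defaultdict(lambda: defaultdict(int))
    for tools in citations.values():
        # Ignore cited items that are not in the tool directory.
        known = [t for t in tools if t in tool_function]
        if not known:
            continue
        function_set = frozenset(tool_function[t] for t in known)
        combination = frozenset(known)
        groups[function_set][combination] += 1
    return groups


groups = group_by_function_set(CITATIONS, TOOL_FUNCTION)
for function_set, combos in groups.items():
    print(sorted(function_set), "has", len(combos), "tool combination(s)")
```

In this toy data, the function set {alignment, variant calling} is realized by two distinct tool combinations, which mirrors the abstract's distinction between function sets and the tool combinations observed within them.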
Keywords
literature mining, bioinformatics tool selection, tool usage patterns, bioinformatics pipeline recovery