Statistical learning of API fully qualified names in code snippets of online forums.

Hung Phan,Hoan Anh Nguyen,Ngoc M. Tran,Linh H. Truong,Anh Tuan Nguyen,Tien N. Nguyen

ICSE（2018）

引用 49|浏览62

暂无评分

摘要

Software developers often make use of the online forums such as StackOverflow (SO) to learn how to use software libraries and their APIs. However, the code snippets in such a forum often contain undeclared, ambiguous, or largely unqualified external references. Such declaration ambiguity and external reference ambiguity present challenges for developers in learning to correctly use the APIs. In this paper, we propose StatType, a statistical approach to resolve the fully qualified names (FQNs) for the API elements in such code snippets. Unlike existing approaches that are based on heuristics, StatType has two well-integrated factors. We first learn from a large training code corpus the FQNs that often co-occur. Then, to derive the FQN for an API name in a code snippet, we use that knowledge and also leverage the context consisting of neighboring API names. To realize those factors, we treat the problem as statistical machine translation from source code with partially qualified names to source code with FQNs of the APIs. Our empirical evaluation on real-world code and StackOverflow posts shows that StatType achieves very high accuracy with 97.6% precision and 96.7% recall, which is 16.5% relatively higher than the state-of-the-art approach.

查看译文

关键词

Type Resolution, Type Inference, Type Annotations, Partial Program Analysis, Statistical Machine Translation, Naturalness, Big Code

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要