Evaluating Automated Program Repair Techniques using Introductory Programming Course Datasets

2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC 2022)

Abstract
Debugging erroneous programs requires a great deal of human effort. To reduce this effort, automating the debugging process has been actively studied. One such form of automation is automated program repair for syntactic errors. Researchers intend these techniques to support novice programmers such as students, because fixing syntactic errors is difficult for novices. However, few datasets exist that consist of programs written by novice programmers at universities and that can be used to evaluate these techniques, and such datasets are difficult to prepare from scratch. Indeed, prior studies usually used only the Indian Institute of Technology Kanpur (IITK) dataset. This limitation confines the findings and implications of prior studies to a case study of a single university. In this study, we conduct a case study at another university to clarify which findings and implications of prior studies remain the same and which change. We prepare three datasets consisting of over 21k programs collected from an introductory programming course in different divisions at our university. We compare the state-of-the-art automated program repair techniques DeepFix, RLAssist, and Dr Repair on these datasets. We found that (1) the best technique remains the same across all the datasets, (2) these techniques fix 8.3% to 54.5% of the syntactic errors in our datasets, which is 6.7% to 32.4% lower than in the IITK dataset, and (3) the types of errors they fix differ across datasets. Hence, the main finding from the IITK dataset remains the same; however, each technique fixes different errors.
Keywords
automated program repair, syntactic error