When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems
CoRR (2024)
Abstract
Context. The adoption of Machine Learning (ML)-enabled systems is steadily
increasing. Nevertheless, there is a shortage of ML-specific quality assurance
approaches, possibly because of the limited knowledge of how quality-related
concerns emerge and evolve in ML-enabled systems. Objective. We aim to
investigate the emergence and evolution of specific types of quality-related
concerns known as ML-specific code smells, i.e., sub-optimal implementation
solutions applied to ML pipelines that may significantly decrease both the
quality and maintainability of ML-enabled systems. More specifically, we
present a plan to study ML-specific code smells by empirically analyzing (i)
their prevalence in real ML-enabled systems, (ii) how they are introduced and
removed, and (iii) their survivability. Method. We will conduct an exploratory
study, mining a large dataset of ML-enabled systems and analyzing over 400k
commits across 337 projects. We will track and inspect the introduction and
evolution of ML smells through CodeSmile, a novel ML smell detector that we
will build to enable our investigation and to detect ML-specific code smells.
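
Illustration. To make the notion of a static ML smell detector concrete, the following is a minimal sketch and not CodeSmile itself. It assumes, purely for illustration, that reading the pandas `.values` attribute (where the pandas documentation recommends `.to_numpy()`) counts as a smell; the function name `find_values_attribute_uses` and the `SAMPLE` snippet are hypothetical. The check is syntactic only and does not verify that the receiver is actually a DataFrame.

```python
import ast
from typing import List


def find_values_attribute_uses(source: str) -> List[int]:
    """Return line numbers where a `.values` attribute is read but not called.

    Illustrative assumption: such a read is treated as a candidate smell.
    No type inference is done, so non-pandas objects may be flagged too.
    """
    tree = ast.parse(source)
    # Attribute nodes used as the callee of a call, e.g. `d.values()`,
    # are method calls (typically dict.values) and are skipped.
    called = {id(n.func) for n in ast.walk(tree) if isinstance(n, ast.Call)}
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and node.attr == "values"
            and isinstance(node.ctx, ast.Load)
            and id(node) not in called
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical input: line 3 reads `.values`, line 4 calls dict.values().
SAMPLE = """import pandas as pd
df = pd.DataFrame({'a': [1, 2]})
arr = df.values
counts = {'x': 1}.values()
"""

print(find_values_attribute_uses(SAMPLE))
```

Running a check of this kind on the version of each file at every commit would, under the same assumptions, give the per-commit presence data from which introduction, removal, and survival of a smell can be derived.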