What the Fork? Finding Hidden Code Clones in npm

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022)(2022)

引用 4|浏览16
暂无评分
摘要
This work presents findings and mitigations on an understudied issue, which we term shrinkwrapped clones, that is endemic to the npm software package ecosystem. A shrinkwrapped clone is a package which duplicates, or near-duplicates, the code of another package without any indication or reference to the original package. This phenomenon represents a challenge to the hygiene of package ecosystems, as a clone package may siphon interest from the package being cloned, or create hidden duplicates of vulnerable, insecure code which can fly under the radar of audit processes. Motivated by these considerations, we propose unwrapper, a mechanism to programmatically detect shrinkwrapped clones and match them to their source package. unwrapper uses a package difference metric based on directory tree similarity, augmented with a prefilter which quickly weeds out packages unlikely to be clones of a target. Overall, our prototype can compare a given package within the entire npm ecosystem (1,716,061 packages with 20,190,452 different versions) in 72.85 seconds, and it is thus practical for live deployment. Using our tool, we performed an analysis of a subset of npm packages, which resulted in finding up to 6,292 previously unknown shrinkwrapped clones, of which up to 207 carried vulnerabilities from the original package that had already been fixed in the original package. None of such vulnerabilities were discoverable via the standard npm audit process.
更多
查看译文
关键词
code clone,clone,npm
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要