SCALE: Constructing Structured Natural Language Comment Trees for Software Vulnerability Detection
CoRR (2024)
Abstract
Recently, there has been a growing interest in automatic software
vulnerability detection. Pre-trained model-based approaches have demonstrated
better performance than other Deep Learning (DL)-based approaches in
detecting vulnerabilities. However, the existing pre-trained model-based
approaches generally employ code sequences as input during prediction, and may
ignore vulnerability-related structural information, as reflected in the
following two aspects. First, they tend to fail to infer the semantics of the
code statements with complex logic such as those containing multiple operators
and pointers. Second, they have difficulty comprehending the various code
execution sequences, which are essential for precise vulnerability detection.
To mitigate these challenges, we propose a Structured Natural Language Comment
tree-based vulnerAbiLity dEtection framework based on pre-trained models,
named SCALE. The proposed Structured Natural Language Comment Tree (SCT)
integrates the semantics of code statements with code execution sequences based
on the Abstract Syntax Trees (ASTs). Specifically, SCALE comprises three main
modules: (1) Comment Tree Construction, which aims at enhancing the model's
ability to infer the semantics of code statements by first using Large
Language Models (LLMs) for comment generation and then adding the comment nodes
to the ASTs; (2) Structured Natural Language Comment Tree Construction, which
aims at explicitly incorporating code execution sequences by combining code
syntax templates with the comment tree; and (3) SCT-Enhanced Representation,
which incorporates the constructed SCTs to better capture vulnerability patterns.
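
The first two modules can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `TEMPLATES` table, the `add_comments` and `render_sct` helpers, and the hand-written comments (standing in for LLM output) are all hypothetical names and data chosen for illustration. The third module, which would feed the resulting text to a pre-trained model, is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Toy AST node. `kind` and `code` are illustrative, not the paper's schema."""
    kind: str
    code: str = ""
    comment: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def add_comments(node: Node, comments: Dict[str, str]) -> None:
    """Module 1 (comment tree): attach a natural-language comment to each node.

    `comments` is an assumed stand-in for LLM-generated comments,
    keyed by the source text of the statement.
    """
    if node.code in comments:
        node.comment = comments[node.code]
    for child in node.children:
        add_comments(child, comments)


# Hypothetical syntax templates that make the execution order explicit.
TEMPLATES = {
    "if": "if {cond}: then {body}",
    "while": "while {cond}: repeat {body}",
}


def render_sct(node: Node) -> str:
    """Module 2 (SCT): combine syntax templates with the comment tree."""
    if node.kind in TEMPLATES:
        # By convention in this sketch, the first child is the condition.
        cond, *body = node.children
        return TEMPLATES[node.kind].format(
            cond=render_sct(cond),
            body="; ".join(render_sct(c) for c in body),
        )
    if node.children:
        return "; ".join(render_sct(c) for c in node.children)
    # Leaf: prefer the comment, fall back to the raw code.
    return node.comment or node.code


# A statement with pointers and an operator, followed by a branch.
tree = Node("block", children=[
    Node("stmt", "char *p = buf + n;"),
    Node("if", children=[
        Node("cond", "*p != 0"),
        Node("stmt", "p++;"),
    ]),
])
comments = {
    "char *p = buf + n;": "set p to the address n bytes past buf",
    "*p != 0": "the byte p points to is non-zero",
    "p++;": "advance p by one byte",
}
add_comments(tree, comments)
text = render_sct(tree)
print(text)
```

In this sketch the pointer arithmetic is replaced by its comment and the branch is spelled out by the template, so the sequence handed to a downstream model carries both statement semantics and control flow in plain language.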