Towards more efficient methods for solving regular-expression heavy string constraints

Theoretical Computer Science(2023)

引用 1|浏览29
暂无评分
摘要
Widespread use of string solvers in the formal analysis of string-heavy programs has led to a growing demand for more efficient and reliable techniques which can be applied in this context. Designing practical algorithms for the (generally undecidable) satisfiability problem for systems of string constraints requires a deep understanding of the structure of constraints present in the targeted cases. In this paper, we first investigate relevant benchmarks containing regular expression membership predicates, extract a series of first order logic theories, and prove the decidability, respectively undecidability, of their satisfiability problem. Further, building on the theoretical results, we present a novel length-aware solving algorithm for the quantifier-free first-order theory over regular expression membership predicates and linear arithmetic over string length. We implement and evaluate this algorithm and related heuristics in the Z3 theorem prover. Besides the theoretical aspects leading to this algorithm, a crucial insight that underpins it is that real -world regex and string formulas contain a wealth of information about upper and lower bounds on lengths of strings, and such information can be used very effectively to simplify operations on automata representing regular expressions. Additionally, we present several novel general heuristics, such as the prefix/suffix method, that can be used to make a variety of regex solving algorithms more efficient in practice. We showcase the power of our algorithm and heuristics via an extensive empirical evaluation over a large and diverse benchmark of 57256 regex-heavy instances, almost 75% of which are derived from industrial applications or contributed by other solver developers. Our solver outperforms five other state-of-the-art string solvers over this benchmark. (c) 2022 Elsevier B.V. All rights reserved.
更多
查看译文
关键词
SMT solving,String solving,String constraints,Regular expressions,Linear length constraints,Heuristics
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要