Formal Language Theory for Practical Security - Extended Abstract -

2021 IEEE Security and Privacy Workshops (SPW)(2021)

引用 0|浏览14
暂无评分
摘要
When binary data are sent from one party to another one, the encoding of the data can be described as a “data serialisation” language (DaSeL). Many DaSeLs employ the “length-prefix” pattern for strings, containers and other data items of variable length. This consists of an encoding of the item’s length, followed by an encoding of the item itself without closing brackets or “end” symbols. The receiver must determine the final byte from the length read before. Lengthprefix languages are not context-free. Thus, the plethora of tools and methods to specify, analyse, and parse context-free languages appears to be useless for length-prefix languages. This seems to explain why improper specffications of length-prefix languages and buggy hand-written parsers are so often a root cause for security issues and exploits, as, e.g., in the case of the famous Heartbleed bug. One might even be tempted to consider the use of length-prefix languages a security hazard. But this consideration would be wrong. We present a transformation of words from “calc-context-free” languages (a superset of context-free and length-prefix languages) into words from proper context-free languages. The transformation actually allows to use tools from context-free languages to deal with length-prefix languages. Our transformation runs on a Turing machine with logarithmic space. This implies the theoretical result of calc-context-free languages being in the complexity class log$C\mathcal{F}\mathcal{L}$. Similarly, deterministic calc-context-free languages are in log$\mathcal{D}\mathcal{C}\mathcal{F}\mathcal{L}$. To run in linear time, one needs to enhance the Turing machine by a stack to store additional data.
更多
查看译文
关键词
data serialization,parsing,formal languages,calc EBNF,calc contect free,length prefix
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要