Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages
arXiv (2023)
Abstract
The expressive power of transformers over inputs of unbounded size can be
studied through their ability to recognize classes of formal languages. We
consider transformer encoders with hard attention (in which all attention is
focused on exactly one position) and strict future masking (in which each
position only attends to positions strictly to its left), and prove that they
are equivalent to linear temporal logic (LTL), which defines exactly the
star-free languages. A key technique is the use of Boolean RASP as a convenient
intermediate language between transformers and LTL. We then take numerous
results known for LTL and apply them to transformers, characterizing how
position embeddings, strict masking, and depth increase expressive power.
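To make the two architectural restrictions concrete, below is a minimal NumPy sketch of one attention layer under the abstract's assumptions: hard attention (all weight goes to a single position, here chosen by argmax) and strict future masking (position i may attend only to positions j < i). The function name, the zero default for position 0, and the leftmost tie-breaking rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def masked_hard_attention(queries, keys, values):
    """One layer of hard attention with strict future masking (a sketch).

    Each position i attends to exactly one position j < i, the one
    maximizing the score q_i . k_j. NumPy's argmax breaks ties toward the
    leftmost j; the paper's tie-breaking convention may differ. Position 0
    has no position strictly to its left, so in this sketch it receives a
    zero vector.
    """
    n, d = queries.shape
    out = np.zeros_like(values)
    for i in range(1, n):
        scores = keys[:i] @ queries[i]  # scores over positions strictly left of i
        j = int(np.argmax(scores))      # hard attention: all weight on one position
        out[i] = values[j]
    return out

# Tiny usage example on a random length-5 sequence of 4-dimensional vectors.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
print(masked_hard_attention(x, x, x).shape)  # (5, 4)
```

For context on the target class: star-free languages are those definable by regular expressions using union, concatenation, and complement but no Kleene star; for example, over the alphabet {a, b}, the language a*b* is star-free because it is the complement of the set of strings containing "ba" as a factor.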