Learning to design protein-protein interactions with enhanced generalization
arXiv (2023)
Abstract
Discovering mutations that enhance protein-protein interactions (PPIs) is
critical for advancing biomedical research and developing improved
therapeutics. While machine learning approaches have substantially advanced the
field, they often struggle to generalize beyond training data in practical
scenarios. The contributions of this work are three-fold. First, we construct
PPIRef, the largest non-redundant dataset of 3D protein-protein
interactions, enabling effective large-scale learning. Second, we leverage the
PPIRef dataset to pre-train PPIformer, a new SE(3)-equivariant model
generalizing across diverse protein-binder variants. We fine-tune PPIformer to
predict effects of mutations on protein-protein interactions via a
thermodynamically motivated adjustment of the pre-training loss function.
Finally, we demonstrate the enhanced generalization of our new PPIformer
approach by outperforming other state-of-the-art methods on new, non-leaking
splits of standard labeled PPI mutational data and in two independent case studies:
optimizing a human antibody against SARS-CoV-2 and increasing the thrombolytic
activity of staphylokinase.