SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed Timed Transport with Recycled Entropies

Tommaso Bonato, Abdul Kabbani, Daniele De Sensi, Rong Pan, Yanfang Le, Costin Raiciu, Mark Handley, Timo Schneider, Nils Blach, Ahmad Ghalayini, Daniel Alves, Michael Papamichael, Adrian Caulfield, Torsten Hoefler

arXiv (2024)

Abstract
With the rapid growth of machine learning (ML) workloads in datacenters, existing congestion control (CC) algorithms fail to deliver the required performance at scale. ML traffic is bursty and bulk-synchronous and thus requires quick reaction and strong fairness. We show that existing CC algorithms that use delay as a main signal react too slowly and are not always fair. We design SMaRTT, a simple sender-based CC algorithm that combines delay, ECN, and optional packet trimming for fast and precise window adjustments. At the core of SMaRTT lies the novel QuickAdapt algorithm that accurately estimates the bandwidth at the receiver. We show how to combine SMaRTT with a new per-packet traffic load-balancing algorithm called REPS to effectively reroute packets around congested hotspots as well as flaky or failing links. Our evaluation shows that SMaRTT alone outperforms EQDS, Swift, BBR, and MPRDMA by up to 50