SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed Timed Transport with Recycled Entropies
arXiv (2024)
Abstract
With the rapid growth of machine learning (ML) workloads in datacenters,
existing congestion control (CC) algorithms fail to deliver the required
performance at scale. ML traffic is bursty and bulk-synchronous and thus
requires quick reaction and strong fairness. We show that existing CC
algorithms that use delay as a main signal react too slowly and are not always
fair. We design SMaRTT, a simple sender-based CC algorithm that combines delay,
ECN, and optional packet trimming for fast and precise window adjustments. At
the core of SMaRTT lies the novel QuickAdapt algorithm that accurately
estimates the bandwidth at the receiver. We show how to combine SMaRTT with a
new per-packet traffic load-balancing algorithm called REPS to effectively
reroute packets around congested hotspots as well as flaky or failing links.
Our evaluation shows that SMaRTT alone outperforms EQDS, Swift, BBR, and MPRDMA
by up to 50