Optimal Round and Sample-Size Complexity for Partitioning in Parallel Sorting

PROCEEDINGS OF THE 35TH ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, SPAA 2023(2023)

引用 0|浏览5
暂无评分
摘要
State-of-the-art parallel sorting algorithms for distributed-memory architectures are based on computing a balanced partitioning via sampling and histogramming. By finding samples that partition the sorted keys into evenly-sized chunks, these algorithms minimize the number of communication rounds required. Histogramming (computing positions of samples) guides sampling, enabling a decrease in the overall number of samples collected. We derive lower and upper bounds on the number of sampling/histogramming rounds required to compute a balanced partitioning. We improve on prior results to demonstrate that when using p processors, O(log* p) rounds with O(p/log* p) samples per round suffice. We match that with a lower bound that shows that any algorithm with O(p) samples per round requires at least Omega(log* p) rounds. Additionally, we prove the Omega(p log p) samples lower bound for one round, thus proving that existing one round algorithms: sample sort, AMS sort [2] and HSS [16] have optimal sample size complexity. To derive the lower bound, we propose a hard randomized input distribution and apply classical results from the distribution theory of runs.
更多
查看译文
关键词
parallel algorithms,parallel sorting,communication cost,communication lower bounds
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要