Sky-Sorter: A Processing-in-Memory Architecture for Large-Scale Sorting

IEEE Transactions on Computers(2023)

引用 3|浏览57
暂无评分
摘要
Sorting is one of the most important algorithms in computer science. Conventional CPUs, GPUs, FPGAs, and ASICs running sorting are fundamentally bottlenecked by the off-chip memory bandwidth, because of their von-Neumann architecture. Processing-near-memory (PNM) designs integrate a CPU, a GPU or an ASIC upon an HBM for sorting, but their sorting throughput are still limited by the HBM bandwidth and capacity. In this paper, we propose a skyrmion racetrack memory (SRM) -based PIM accelerator, Sky-Sorter , to enhance the sorting performance of large-scale datasets. Sky-Sorter implements samplesort which involves four steps, sampling, splitting marker sorting, partitioning, and bucket sorting. An SRM-based random number generator (TRNG) is used in Sky-Sorter to randomly sample records from the dataset. Sky-Sorter divides the large dataset into many buckets based on sampled splitting markers by our proposed SRM-based partitioner. Its partitioning throughput matches the off-chip memory bandwidth. We further designed an SRM-based sorting unit (SU) to sort all records of a bucket without introducing extra CMOS logic. Our SU uses the fast in-cell insertion characteristics of SRMs to implement and perform insertsort within a bucket. Sky-Sorter employs SUs to sort all buckets simultaneously by fully utilizing large internal array bandwidth. Compared to state-of-the-art accelerators, Sky-Sorter improves the throughput per Watt by $\sim 4\times$ .
更多
查看译文
关键词
Processing-in-memory,large-scale sorting
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要