Svelto: High-Level Synthesis of Multi-Threaded Accelerators for Graph Analytics

Marco Minutoli,Vito Giovanni Castellana,Nicola Saporetti,Stefano Devecchi,Marco Lattuada,Pietro Fezzardi,Antonino Tumeo,Fabrizio Ferrandi

IEEE Transactions on Computers（2022）

引用 17|浏览46

暂无评分

摘要

Graph analytics are an emerging class of irregular applications. Operating on very large datasets, they present unique behaviors, such as fine-grained, unpredictable memory accesses, and highly unbalanced task level parallelism, that make existing high-performance general-purpose processors or accelerators (e.g., GPUs) suboptimal. To address these issues, research and industry are developing a variety of custom accelerator designs for this application area, including solutions based on reconfigurable devices (Field Programmable Gate Arrays). These new approaches often employ High-Level Synthesis (HLS) to Speed up the development of the accelerators. In this paper, we propose a novel architecture template for the automatic generation of accelerators for graph analytics and irregular applications. The architecture template includes a dynamic task scheduling mechanism, a parallel array of accelerators that enables supporting task-level parallelism with context switching, and a related multi-channel memory interface that decouples communication from computation and provides support for fine-grained atomic memory operations. We discuss the integration of the architectural template in an HLS flow, presenting the necessary modifications to enable automatic generation of the custom architectures starting from OpenMP annotated code. We evaluate our approach first by synthesizing and exploring triangle counting, a common graph algorithm, and then by synthesizing custom designs for a set of graph database benchmark queries, representing series of graph pattern matching routines. We compare the synthesized accelerators with previous state-of-the-art methodologies for the synthesis of parallel architectures, showing that the proposed approach allows reducing resource usage by optimizing the number of accelerators replicas without any performance penalty.

查看译文

关键词

Parallel architectures,multi-threading,dynamic task scheduling,context switching,RDF,SPARQL,high performance data analytics,big data

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要