Conveyor: Towards Asynchronous Dataflow in Systolic Array to Exploit Unstructured Sparsity

Seongwook Kim, Gwangeun Byeon, Sihyung Kim, Hyungjin Kim, Seokin Hong

2023 IEEE 41st International Conference on Computer Design (ICCD), 2023

Abstract
Systolic array (SA) architectures offer efficient parallel computation through simple data movement across processing elements (PEs). However, their rigid structure and synchronous dataflow limit flexibility in handling sparse computations, resulting in underutilized resources and suboptimal performance. In this paper, we propose Conveyor-SA, a novel SA-based accelerator architecture that leverages asynchronous dataflow to exploit unstructured sparsity. Conveyor-SA introduces three core mechanisms: Chunk Propagation for parallel data processing, PE Grouping to efficiently accelerate both sparse and dense CNN models, and Conveyor Queue for load-imbalance mitigation. Our experimental results demonstrate that Conveyor-SA achieves an average speedup of 1.68x over competing designs when processing conventional CNN models. In addition, Conveyor-SA delivers a 1.42x speedup over a state-of-the-art sparse SA architecture while reducing the chip area requirement by 19.5%.
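The underutilization claim in the abstract can be illustrated with a minimal sketch. It is not the Conveyor-SA design; it only assumes that a dense, synchronous SA issues every multiply-accumulate (MAC) of a matrix product regardless of zero operands, and it counts how many of those MACs contribute to the result. The function name `useful_mac_fraction`, the matrix sizes, and the 70% sparsity level are illustrative assumptions, not values from the paper.

```python
import numpy as np

def useful_mac_fraction(weights, activations):
    """Fraction of MACs in weights @ activations whose operands are both nonzero.

    Assumption: a dense synchronous array performs all M*K*N MACs,
    so any MAC with a zero operand is wasted work.
    """
    w_nz = (weights != 0).astype(np.int64)      # (M, K) mask of nonzero weights
    a_nz = (activations != 0).astype(np.int64)  # (K, N) mask of nonzero activations
    useful = int((w_nz @ a_nz).sum())           # count of MACs with two nonzero operands
    total = weights.shape[0] * weights.shape[1] * activations.shape[1]
    return useful / total

# Illustrative input: unstructured weight sparsity of roughly 70% (assumed value).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W[rng.random(W.shape) < 0.7] = 0.0
A = rng.standard_normal((64, 64))
print(f"useful MAC fraction: {useful_mac_fraction(W, A):.2f}")
```

Under this assumption, the fraction of useful MACs roughly tracks the weight density, which is the idle capacity that sparsity-aware dataflows aim to recover.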
Keywords
Deep Learning Accelerator,Convolutional Neural Network,Systolic Array