Dataflow-Aware PIM-Enabled Manycore Architecture for Deep Learning Workloads
arXiv (2024)
Abstract
Processing-in-memory (PIM) has emerged as an enabler for the energy-efficient
and high-performance acceleration of deep learning (DL) workloads. Resistive
random-access memory (ReRAM) is one of the most promising technologies to
implement PIM. However, as the complexity of deep convolutional neural networks
(DNNs) grows, we need to design a manycore architecture with multiple
ReRAM-based processing elements (PEs) on a single chip. Existing PIM-based
architectures mostly focus on computation while ignoring the role of
communication. ReRAM-based tiled manycore architectures often involve many
PEs, which need to be interconnected via an efficient
on-chip communication infrastructure. Simply allocating more resources (ReRAMs)
to speed up only computation is ineffective if the communication infrastructure
cannot keep up with it. In this paper, we highlight the design principles of a
dataflow-aware PIM-enabled manycore platform tailor-made for various types of
DL workloads. We consider the design challenges with both 2.5D interposer- and
3D integration-enabled architectures.