YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs

Cyrus Zhou, Zack Hassman, Dhirpal Shah,Vaughn Richard,Yanjing Li

PROCEEDINGS OF THE 33RD ACM SIGPLAN INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION, CC 2024（2024）

引用 0|浏览2

暂无评分

摘要

We address the challenges associated with deploying neural networks on CPUs, with a particular focus on minimizing inference time while maintaining accuracy. Our novel approach is to use the dataflow (i.e., computation order) of a neural network to explore data reuse opportunities using heuristic-guided analysis and a code generation framework, which enables exploration of various Single Instruction, Multiple Data (SIMD) implementations to achieve optimized neural network execution. Our results demonstrate that the dataflowthat keeps outputs in SIMD registers while also maximizing both input and weight reuse consistently yields the best performance for a wide variety of inference workloads, achieving up to 3x speedup for 8-bit neural networks, and up to 4.8x speedup for binary neural networks, respectively, over the optimized implementations of neural networks today.

查看译文

关键词

code generation,compiler support,SIMD vectorization,CPU optimization,data~ow,neural network

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要