High Performance Binary Neural Networks on the Xeon+FPGA™ Platform

2017 27th International Conference on Field Programmable Logic and Applications (FPL), 2017

Abstract
Convolutional neural networks (CNNs) are deployed in a wide range of image recognition, scene segmentation and object detection applications. Achieving state-of-the-art accuracy in CNNs often results in large models and complex topologies that require significant compute resources to complete in a timely manner. Binarised neural networks (BNNs) have been proposed as an optimised variant of CNNs, which constrain the weights and activations to +1 or -1 and thus offer compact models and lower computational complexity per operation. This paper presents a high performance BNN accelerator on the Intel® Xeon+FPGA™ platform. The proposed accelerator is designed to take advantage of the Xeon+FPGA system so that a specialised FPGA architecture can be targeted at the most compute-intensive parts of the BNN whilst other parts of the topology are handled by the Xeon™ CPU. The implementation is evaluated by comparing the raw compute performance and energy efficiency for key layers in standard CNN topologies against an Nvidia Titan X Pascal GPU and other published FPGA BNN accelerators. The results show that our single-package integrated Arria™ 10 FPGA accelerator coupled with a high-end Xeon CPU can offer comparable performance and better energy efficiency than a high-end discrete Titan X GPU card. In addition, our solution delivers the best performance compared to previous BNN FPGA implementations.
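To illustrate why constraining weights and activations to +1 or -1 lowers the cost per operation, the following is a minimal sketch of the XNOR-popcount dot product commonly used in BNNs. It is an assumption for illustration, not the accelerator described in the paper; the names `pack_bits` and `binary_dot` are hypothetical.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int: bit i is 1 for +1, 0 for -1."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, w_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Assumed encoding: bit 1 means +1, bit 0 means -1. XNOR marks the
    positions where the signs agree (product +1); every other position
    contributes -1, so the result is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask  # XNOR restricted to the low n bits
    matches = bin(agree).count("1")    # popcount
    return 2 * matches - n


activations = [1, -1, -1, 1, 1, -1, 1, 1]
weights = [1, 1, -1, -1, 1, -1, -1, 1]

# Reference result using ordinary multiply-accumulate.
reference = sum(a * w for a, w in zip(activations, weights))
# Same result using only bitwise operations and a bit count.
packed = binary_dot(pack_bits(activations), pack_bits(weights), len(weights))
print(reference, packed)
```

Under this encoding each multiply-accumulate reduces to a bitwise operation and a bit count, which is the kind of logic that maps compactly onto FPGA fabric.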