CNN-on-AWS: Efficient Allocation of Multikernel Applications on Multi-FPGA Platforms

Periodicals(2021)

Abstract
Multi-FPGA platforms, such as Amazon AWS F1, can run multikernel pipelined applications, such as convolutional neural networks (CNNs), in the cloud with excellent performance and lower energy consumption than CPUs or GPUs. We propose a method to efficiently map these applications onto multi-FPGA platforms so as to maximize application throughput. For the given resources, our methodology finds the optimal number of parallel instances of each kernel in the pipeline and their allocation to one or more of the available FPGAs. We obtain this by formulating and solving a mixed-integer nonlinear optimization problem, in which we model the performance of each component and the duration of the phases into which the accelerated computation can be split, namely: 1) data transfer from a host CPU to the DDR memory of each FPGA; 2) data transfer from FPGA DDR to FPGA on-chip memory; 3) kernel computation on the FPGA; 4) data transfer from FPGA on-chip memory to FPGA DDR; and 5) data transfer from FPGA DDR to host. Finding the optimal solution with a mixed-integer nonlinear programming (MINLP) solver is often highly inefficient. Hence, we provide a fast heuristic method that, according to our experiments, can be much more efficient than the MINLP solver while finding comparable results. For larger problems (more CNN layers), our heuristic method can find much better solutions than the MINLP solver several thousand times faster, even if we run the latter for a very long time.
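To make the allocation problem concrete, here is a minimal sketch of the underlying idea that a pipeline's throughput is bounded by its slowest stage, so extra kernel instances are best spent on the current bottleneck. It is not the heuristic proposed in the paper, which the abstract does not describe. The names `Kernel`, `stage_time`, and `greedy_replicate` are hypothetical, as are the simplifying assumptions: stage time scales ideally with the number of instances, all FPGAs are treated as one pooled resource budget, and the data-transfer phases are ignored.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    latency_ms: float  # assumed time for one instance to process one input
    area: int          # assumed resource units consumed by one instance


def stage_time(kernel, copies):
    """Effective stage time, assuming ideal scaling with the instance count."""
    return kernel.latency_ms / copies


def greedy_replicate(kernels, budget):
    """Add instances to the bottleneck stage until the budget cannot fit another.

    Returns (instances per kernel, estimated throughput in inputs per second).
    """
    copies = {k.name: 1 for k in kernels}
    used = sum(k.area for k in kernels)
    if used > budget:
        raise ValueError("budget too small for one instance of every kernel")
    while True:
        # The slowest stage limits the throughput of the whole pipeline.
        bottleneck = max(kernels, key=lambda k: stage_time(k, copies[k.name]))
        if used + bottleneck.area > budget:
            break
        copies[bottleneck.name] += 1
        used += bottleneck.area
    slowest = max(stage_time(k, copies[k.name]) for k in kernels)
    return copies, 1000.0 / slowest


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    layers = [Kernel("conv1", 8.0, 2), Kernel("conv2", 4.0, 1), Kernel("fc", 2.0, 1)]
    print(greedy_replicate(layers, budget=8))
```

A greedy rule like this can stop early when the bottleneck kernel no longer fits even though a cheaper kernel would, and it says nothing about which FPGA each instance is placed on. Those are the aspects that the MINLP formulation summarized above is meant to capture.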
Keywords
Field programmable gate arrays, Kernel, Pipelines, Resource management, Data transfer, Throughput, Task analysis, Allocation, convolutional neural networks (CNNs), multi-FPGA, optimization