Characterizing the costs and benefits of hardware parallelism in accelerator cores
Computer Design (2013)
Abstract
Power and utilization constraints are limiting the performance gains of traditional architectures. Designers are increasingly embracing specialization to improve performance in the era of dark silicon. General-purpose processors are beginning to resemble SoCs from the embedded domain, and now include many specialized accelerator cores to improve computational throughput while reducing the energy cost of computation. The design space of accelerator cores is wide and varied. Designers can specify how much parallelism to expose in hardware by varying input width, pipeline depth, number of compute lanes, and so on. In this paper, we study three accelerator cores (DES, FFT, and Jacobi Transform) that exhibit three different types of computation: streaming cryptographic, butterfly DSP, and stencil. We investigate methods to increase parallelism within the accelerator while remaining on the Pareto frontier, and examine the trade-offs faced by designers with respect to area, power, and throughput. We present models of these trade-offs and provide insight into the design of cores under real-world constraints.
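As an illustration of what "remaining on the Pareto frontier" means for area, power, and throughput, the following minimal sketch filters a set of candidate designs down to the non-dominated ones. The design names and all numeric values are hypothetical and are not taken from the paper; the `DesignPoint` type and the `dominates` and `pareto_frontier` helpers are likewise assumptions made for this example, not the authors' models.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """One accelerator configuration (all values hypothetical)."""
    name: str
    area_mm2: float         # lower is better
    power_mw: float         # lower is better
    throughput_gbps: float  # higher is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    no_worse = (
        a.area_mm2 <= b.area_mm2
        and a.power_mw <= b.power_mw
        and a.throughput_gbps >= b.throughput_gbps
    )
    strictly_better = (
        a.area_mm2 < b.area_mm2
        or a.power_mw < b.power_mw
        or a.throughput_gbps > b.throughput_gbps
    )
    return no_worse and strictly_better


def pareto_frontier(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical design points: more lanes cost area and power but add throughput.
candidates = [
    DesignPoint("1-lane", 1.0, 10.0, 1.0),
    DesignPoint("2-lane", 1.8, 19.0, 2.0),
    DesignPoint("4-lane", 3.5, 40.0, 3.9),
    DesignPoint("2-lane-deep", 2.0, 25.0, 1.9),  # worse than "2-lane" on all metrics
]

frontier_names = [p.name for p in pareto_frontier(candidates)]
print(frontier_names)
```

In this toy data set, the "2-lane-deep" configuration is dropped because "2-lane" is smaller, lower power, and faster; the remaining points each represent a distinct trade-off a designer might choose under a given area or power budget.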
Keywords
Jacobian matrices, cryptography, digital signal processing chips, fast Fourier transforms, multiprocessing systems, parallel processing, DES accelerator cores, FFT accelerator cores, Jacobi transform accelerator cores, Pareto frontier, accelerator cores, butterfly DSP, computation-throughput, dark-silicon era, data encryption standard, energy-cost reduction, general purpose processors, hardware parallelism benefits, hardware parallelism costs, performance gains, power constraints, stencil, streaming cryptographic, utilization constraints, Accelerator architectures, Analytical models, Computer architecture, System-on-chip