Understanding the Behavior of Pthread Applications on Non-Uniform Cache Architectures

Parallel Architectures and Compilation Techniques (2011)

Abstract
Future scalable multi-core chips are expected to implement a shared last-level cache (LLC) with banks distributed on chip, forcing a core to incur non-uniform access latencies to each bank. Consequently, high performance and energy efficiency depend on whether a thread's data is placed in local or nearby banks. Using compiler and programmer support, we aim to find an alternative solution to existing high-overhead designs. In this paper, we take existing parallel programs written in Pthreads, and show the performance gap between current static mapping schemes, costly migration schemes and idealized static and dynamic best-case scenarios.
Keywords
non-uniform cache architectures, pthread applications, costly migration scheme, nearby bank, alternative solution, high-overhead design, high performance, current static mapping scheme, future scalable multi-core chip, dynamic best-case scenario, performance gap, energy efficiency, chip, data structure, data structures, system on a chip, resource management, performance, parallel programming, energy efficient, resource manager