KokkACC: Enhancing Kokkos with OpenACC

Pedro Valero-Lara,Seyong Lee,Marc Gonzalez-Tallada,Joel Denny,Jeffrey S. Vetter

2022 Workshop on Accelerator Programming Using Directives (WACCPD)（2022）

引用 2|浏览13

暂无评分

摘要

Template metaprogramming is gaining popularity as a high-level solution for achieving performance portability on heterogeneous computing resources. Kokkos is a representative approach that offers programmers high-level abstractions for generic programming while most of the device-specific code generation and optimizations are delegated to the compiler through template specializations. For this, Kokkos provides a set of device-specific code specializations in multiple back ends, such as CUDA and HIP. Unlike CUDA or HIP, OpenACC is a high-level and directive-based programming model. This descriptive model allows developers to insert hints (pragmas) into their code that help the compiler to parallelize the code. The compiler is responsible for the transformation of the code, which is completely transparent to the programmer. This paper presents an OpenACC back end for Kokkos: KokkACC. As an alternative to Kokkos’s existing device-specific back ends, KokkACC is a multi-architecture back end providing a high-productivity programming environment enabled by OpenACC’s high-level and descriptive programming model. Moreover, we have observed competitive performance; in some cases, KokkACC is faster (up to 9×) than NVIDIA’s CUDA back end and much faster than OpenMP’s GPU offloading back end. This work also includes implementation details and a detailed performance study conducted with a set of mini-benchmarks (AXPY and DOT product) and three mini-apps (LULESH, miniFE and SNAP, a LAMMPS proxy mini-app).

查看译文

关键词

OpenACC,C++ Metaprogramming,Kokkos,CUDA,OpenMP Target,Parallel Programming Models

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要