Tutorial "Kokkos: Enabling many-core performance portability through polymorphic memory access patterns"
The many-core revolution can be characterized by increasing thread counts, decreasing memory per thread, and diversity of continually evolving many-core architectures. High performance computing (HPC) applications and libraries must exploit increasingly fine levels of parallelism within their codes to sustain scalability on these devices. A major obstacle to performance portability is the diverse and conflicting set of constraints on memory access patterns across devices. Contemporary portable programming models address many-core parallelism (e.g., OpenMP, OpenACC, OpenCL) but fail to address memory access patterns.
The Kokkos C++ library enables applications and domain libraries to achieve performance portability on diverse many-core architectures by unifying abstractions for both fine-grain data parallelism and memory access patterns. In this tutorial we describe Kokkos’ abstractions, summarize its application programmer interface (API), present performance results for unit-test kernels and mini-applications, and outline an incremental strategy for migrating legacy C++ codes to Kokkos. The Kokkos library is under active research and development to incorporate capabilities from new generations of many-core architectures, and to address a growing list of applications and domain libraries.
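To make the idea of a polymorphic memory access pattern concrete, the following is a minimal sketch in standard C++ only. It is not the Kokkos API: the names RowMajor, ColMajor, View2D and fill_and_sum are hypothetical and exist only for this illustration. The kernel is written once against a(i, j), and the layout policy chosen at compile time decides how those indices map to memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout policies (illustration only, not Kokkos types).
// Each one maps a 2D index (i, j) to an offset in a flat array.
struct RowMajor {
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t /*rows*/, std::size_t cols) {
        return i * cols + j;  // consecutive j are adjacent in memory
    }
};

struct ColMajor {
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t rows, std::size_t /*cols*/) {
        return j * rows + i;  // consecutive i are adjacent in memory
    }
};

// A tiny 2D array whose memory layout is a compile-time template parameter.
template <class Layout>
class View2D {
public:
    View2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[Layout::offset(i, j, rows_, cols_)];
    }

    double operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[Layout::offset(i, j, rows_, cols_)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Expose the flat storage so the effect of the layout can be inspected.
    const std::vector<double>& raw() const { return data_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// The kernel is written once; it never mentions the layout.
template <class Layout>
double fill_and_sum(View2D<Layout>& a) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.rows(); ++i) {
        for (std::size_t j = 0; j < a.cols(); ++j) {
            a(i, j) = 10.0 * static_cast<double>(i) + static_cast<double>(j);
            sum += a(i, j);
        }
    }
    return sum;
}
```

Calling fill_and_sum on a View2D<RowMajor> and on a View2D<ColMajor> of the same shape gives the same sum, while the element order in raw() differs. Under the assumptions of this sketch, that is the separation the abstract describes: the algorithm stays fixed and the access pattern is chosen per architecture.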
Tutor:
Dr. Christian Trott
Sandia National Laboratories
Albuquerque, NM, 87185, United States
Contact:
Dr. Godehard Sutmann
Forschungszentrum Jülich
Jülich Supercomputing Centre
g.sutmann@fz-juelich.de
This tutorial is open to everyone interested in computational science and programming models. It is not limited to C++ experts and also gives a broad introduction to performance portability. PhD students are especially invited. Please bring your own laptop.
Please register with g.sutmann@fz-juelich.de by 4 October 2018.