GPU Optimizations for Atmospheric Chemical Kinetics

Theodoros Christoudias, Timo Kirfel, Astrid Kerkweg, Domenico Taraborrelli, Georges-Emmanuel Moulard, Erwan Raffin, Victor Azizi, Gijs van den Oord, Ben van Werkhoven
Abstract
We present a series of optimizations to alleviate stack memory overflow issues and improve overall performance of GPU computational kernels in atmospheric chemical kinetics model simulations. We use heap memory in numerical solvers for stiff ODEs, move chemical reaction constants and tracer concentration arrays from stack to global memory, use direct pointer indexing for array memory access, and use CUDA streams to overlap computation with memory transfer to the device. Overall, an order of magnitude reduction in GPU memory requirements is achieved, allowing for simultaneous offloading from multiple MPI processes per node and/or increasing the chemical mechanism complexity.