Presentation
FFT-Based Spherical Harmonics and Radial Transforms on GPU
Description
Modern high-performance computing clusters increasingly rely on GPUs, rather than CPUs, as their main source of computational power. GPUs are tailored to data-parallel algorithms in which many cores perform the same operations on different memory locations. However, adapting CPU code to run within GPU constraints is often non-trivial. First, not all algorithms are easy to parallelize. Second, there is no single way to program GPUs from different manufacturers, since each vendor promotes its own solution. To address this, a runtime GPU code generation and optimization platform, PfSolve, was developed during this PhD. PfSolve was originally based on VkFFT (a Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library) and has since been generalized and restructured.
QuiCC is a code under development in our research group, designed to solve the equations of magnetohydrodynamics in a full sphere and other geometries. It takes a fully spectral approach to the problem, with Jones-Worland (JW) polynomials as the radial basis and spherical harmonics (SH) as the spherical basis. The main goal of this dissertation is a GPU implementation of the FFT-based algorithm for evaluating the JW and SH transforms, which is more accurate and requires less memory than the regular quadrature approach. One of its main building blocks is an efficient tridiagonal GPU solver, developed with a new warp-programming approach.
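For readers unfamiliar with tridiagonal systems, the problem that such a solver addresses can be illustrated with a minimal sequential sketch. This is the textbook Thomas algorithm in plain Python, not the warp-level GPU solver described in this work; the function name `solve_tridiagonal` and the storage convention (three arrays of equal length, with `lower[0]` and `upper[-1]` unused) are assumptions made for illustration only.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for a tridiagonal A (sequential Thomas algorithm).

    Row i of A reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1].
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    sketch assumes a well-conditioned (e.g. diagonally dominant) matrix.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution: recover x from the last row upwards.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Usage: the matrix [[2,1,0],[1,2,1],[0,1,2]] applied to [1,2,3] gives [4,8,8].
solution = solve_tridiagonal([0.0, 1.0, 1.0], [2.0, 2.0, 2.0],
                             [1.0, 1.0, 0.0], [4.0, 8.0, 8.0])
```

Both loops carry a dependency from one row to the next, which is why a direct port of this scheme does not map well onto data-parallel hardware and a dedicated GPU formulation is needed.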
This work also presents additional algorithms redesigned within the platform, such as a finite-difference solver and double-double emulation of FP128 calculations on GPUs.
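The idea behind double-double arithmetic can be shown with a short sketch: a value is stored as an unevaluated sum of two FP64 numbers `(hi, lo)`, and error-free transformations recover the rounding error of each operation. The code below is a generic CPU illustration in Python using the standard two-sum construction; the helper names `two_sum`, `quick_two_sum` and `dd_add` are assumptions for this example and are not taken from PfSolve.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b = s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def quick_two_sum(a, b):
    """Same as two_sum, but valid only when |a| >= |b|."""
    s = a + b
    err = b - (s - a)
    return s, err


def dd_add(x, y):
    """Add two double-double numbers given as (hi, lo) pairs."""
    s, e = two_sum(x[0], y[0])   # exact sum of the high parts
    e += x[1] + y[1]             # fold in the low parts
    return quick_two_sum(s, e)   # renormalize into (hi, lo)


# Usage: 1e-20 is lost in plain FP64 but survives in double-double.
plain = (1.0 + 1e-20) - 1.0
dd = dd_add(dd_add((1.0, 0.0), (1e-20, 0.0)), (-1.0, 0.0))
```

Here `plain` evaluates to `0.0`, while `dd` retains the small term in its high part. A pair of FP64 values carries roughly 106 bits of significand, slightly fewer than the 113 bits of IEEE binary128, and keeps the FP64 exponent range, so it approximates FP128 rather than reproducing it exactly.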