Presentation
Increasing Energy Efficiency of Astrophysics Simulations Through GPU Frequency Scaling
Session: Sustainable Supercomputing
Description: The growing demand for HPC drives significant energy consumption, posing a sustainability challenge for HPC centers, users, and society, especially as environmental regulations become stricter. While efforts exist to reduce overall system energy consumption, workload-specific energy-efficiency optimizations for GPU-based workloads have received insufficient attention. This work addresses that gap by proposing approaches that increase energy efficiency by controlling the GPU frequency dynamically through code instrumentation. We further investigate the energy-performance trade-off by comparing static and dynamic GPU frequency scaling strategies, as well as DVFS (dynamic voltage and frequency scaling), within SPH-EXA, a newly developed, open-source, GPU-centric simulation framework specializing in astrophysical simulations. Our findings demonstrate that code instrumentation enables more detailed energy measurements than traditional HPC system monitoring provides, while dynamic frequency scaling of computational kernels reduces energy consumption with limited performance loss. This approach empowers researchers to develop more sustainable large-scale scientific simulations that run mainly on GPUs.
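
As a rough illustration of what per-kernel frequency control through code instrumentation could look like, the Python sketch below wraps each kernel launch in a region that sets a GPU clock, restores the default afterwards, and records the energy spent inside the region. It is not the SPH-EXA implementation. `SimulatedGpu`, `scaled_region`, `timestep`, the kernel names, and the power model (a constant term plus a term cubic in frequency, with run time inversely proportional to frequency) are all assumptions made for this example. On NVIDIA hardware, the three device methods might map onto NVML calls such as `nvmlDeviceSetGpuLockedClocks`, `nvmlDeviceResetGpuLockedClocks`, and `nvmlDeviceGetTotalEnergyConsumption`, but that mapping is not shown or tested here.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class SimulatedGpu:
    """Toy stand-in for a GPU (assumption); not a real driver binding."""
    default_mhz: int = 1400
    clock_mhz: int = 1400
    energy_mj: float = 0.0  # cumulative energy counter in millijoules

    def set_locked_clock(self, mhz: int) -> None:
        self.clock_mhz = mhz

    def reset_locked_clock(self) -> None:
        self.clock_mhz = self.default_mhz

    def total_energy_mj(self) -> float:
        return self.energy_mj

    def run_kernel(self, work_mcycles: float) -> float:
        """Advance the toy model for one kernel; returns elapsed seconds."""
        # Assumed compute-bound kernel: Mcycles / MHz = seconds.
        seconds = work_mcycles / self.clock_mhz
        # Assumed power model: 50 W static + 250 W dynamic at default clock,
        # with the dynamic part scaling as frequency cubed.
        watts = 50.0 + 250.0 * (self.clock_mhz / self.default_mhz) ** 3
        self.energy_mj += watts * seconds * 1000.0
        return seconds


@contextmanager
def scaled_region(gpu, name, mhz, log):
    """Instrumented region: set the clock, run the body, restore, log energy."""
    start_mj = gpu.total_energy_mj()
    gpu.set_locked_clock(mhz)
    try:
        yield
    finally:
        # Always restore the default clock, even if the kernel raises.
        gpu.reset_locked_clock()
        log[name] = log.get(name, 0.0) + gpu.total_energy_mj() - start_mj


def timestep(gpu, log, mhz_by_kernel):
    """Run two hypothetical kernels, each inside its own instrumented region."""
    elapsed = 0.0
    for name, work in (("kernel_a", 2800.0), ("kernel_b", 1400.0)):
        mhz = mhz_by_kernel.get(name, gpu.default_mhz)
        with scaled_region(gpu, name, mhz, log):
            elapsed += gpu.run_kernel(work)
    return elapsed


def compare(low_mhz=1050):
    """Compare a default-clock run with one that down-clocks kernel_a only."""
    base_log, scaled_log = {}, {}
    t_base = timestep(SimulatedGpu(), base_log, {})
    t_scaled = timestep(SimulatedGpu(), scaled_log, {"kernel_a": low_mhz})
    return t_base, base_log, t_scaled, scaled_log
```

Calling `compare()` returns the run time and per-kernel energy log for both runs. Under the toy model, down-clocking `kernel_a` lowers its energy and lengthens its run time, which is the kind of energy-performance trade-off the abstract describes. The size of the effect comes entirely from the assumed model and says nothing about measured results.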