Presentation
Evaluating Tuning Opportunities of the LLVM/OpenMP Runtime
Description
Tuning parallel applications on multi-core architectures is an arduous task. Several studies have utilized auto-tuning for OpenMP applications via standardized user-facing features, namely the number of threads, thread placement, and binding policy. In this paper, we analyze OpenMP application runtime through an exhaustive exploration of all relevant configuration options of the LLVM/OpenMP runtime.
Our findings reveal trends in tuning potential and yield architecture-aware tuning suggestions. We will open-source the 240,000 unique samples collected during the experiments. These runs were conducted on three different CPU architectures vital to the HPC and datacenter community. The applications include popular benchmark suites and microbenchmarks, namely NPB, the Barcelona OpenMP Task Suite, XSBench, RSBench, SU3Bench, and LULESH.
We employ machine learning algorithms to analyze, explain, and form qualitative relations between features comprising the architecture, application, input size, threads, and environment variables. These relations are then used to recommend configurations for a given application type and architecture.