Presentation
Enhancing Small Message Aggregation with Directive-Based Deferred Execution
Description
The partitioned global address space (PGAS) model offers one-sided communication operations to efficiently access local and remote data through a distributed shared memory model using point-to-point network operations. An extension to the OpenSHMEM PGAS library previously demonstrated how message aggregation could be applied in a minimally intrusive manner to an application, while still achieving a significant portion of the performance possible through manual tuning. However, its primary deficiency was the inability to abstract dependencies between aggregated remote memory accesses and their subsequent uses, which must be managed explicitly by applications. This undermined its goal of preserving algorithmic intent. In this paper, we present a novel directive-based approach for automatically deferring the execution of arbitrary code that depends on aggregated messages, shifting the concern of their efficient management from the application to the implementation. We demonstrate our approach using two applications from the bale 3.0 classic suite on the Frontier supercomputer.
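The dependency problem described above can be illustrated with a toy, single-process model. The sketch below is not the paper's directive syntax and not the OpenSHMEM API: the `Aggregator` class, `get_deferred`, `quiet`, and `gather_and_double` are hypothetical names chosen for illustration, and a plain dictionary stands in for remote memory. It only shows the general idea of batching small remote reads per destination and deferring the code that consumes each value until its batch has been delivered.

```python
from collections import defaultdict


class Aggregator:
    """Toy model of per-destination message aggregation (hypothetical, not OpenSHMEM)."""

    def __init__(self, remote_tables, batch_size=4):
        self.remote = remote_tables        # pe -> list of values; stands in for remote memory
        self.batch_size = batch_size
        self.pending = defaultdict(list)   # pe -> [(index, continuation), ...]
        self.flushes = 0                   # count of simulated aggregated network messages

    def get_deferred(self, pe, index, continuation):
        """Queue a remote read; continuation(value) runs once the batch is delivered."""
        self.pending[pe].append((index, continuation))
        if len(self.pending[pe]) >= self.batch_size:
            self._flush(pe)

    def _flush(self, pe):
        batch = self.pending.pop(pe, [])
        if not batch:
            return
        self.flushes += 1                  # one aggregated message instead of len(batch) small ones
        for index, continuation in batch:
            continuation(self.remote[pe][index])

    def quiet(self):
        """Deliver every partially filled batch so that all deferred code has run."""
        for pe in list(self.pending):
            self._flush(pe)


def gather_and_double(agg, requests):
    """Gather-style loop: the code that uses each value is deferred, not run inline."""
    out = [None] * len(requests)
    for i, (pe, index) in enumerate(requests):
        def use(value, i=i):               # dependent code, executed when the data arrives
            out[i] = value * 2
        agg.get_deferred(pe, index, use)
    agg.quiet()
    return out


if __name__ == "__main__":
    remote = {0: [10, 11, 12], 1: [20, 21, 22]}
    agg = Aggregator(remote, batch_size=2)
    print(gather_and_double(agg, [(0, 1), (1, 2), (0, 0), (1, 0), (0, 2)]))
    print("aggregated messages:", agg.flushes)
```

In this sketch the application still writes the continuation by hand. The approach described in the abstract aims to have the implementation, rather than the application, manage that deferral.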