Presentation

Scalable Performance and Accuracy Analysis for Distributed and Extreme-Scale Systems
Description
The Scalable Performance and Accuracy analysis for Distributed and Extreme-scale systems (SPADE) project focuses on advancing monitoring, optimization, evaluation, and decision-making capabilities for extreme-scale systems. This poster presents several efforts targeting advanced monitoring capabilities. The first is developing support for AMD's new RocProfiler SDK to enable the analysis of hardware performance counters on AMD APUs, such as the MI300, which will be integrated into El Capitan. The second is extending the PAPI library with heterogeneous CPU support, so that users can simultaneously monitor the different core types on chips that combine high-end and low-end cores, which enables more effective tuning across those cores. The third is the development of a Python version of PAPI (cyPAPI), intended for use with the frameworks and tools being developed for Python in HPC environments. This last effort extends to exploring beta versions of cyPAPI with PyTorch to advance decision-making capabilities for mixed-precision tuning of machine learning applications.
Event Type
ACM Student Research Competition: Graduate Poster
ACM Student Research Competition: Undergraduate Poster
Doctoral Showcase
Posters
Time
Tuesday, 19 November 2024, 12pm - 5pm EST
Location
B302-B305
Registration Categories
TP
XO/EX