Presentation
An Incremental Multi-Level, Multi-Scale Approach to Assessment of Multifidelity HPC Systems
Description
With the growing architectural complexity and size of large-scale computing systems, monitoring and analyzing system behavior and events has become daunting. Sensors housed in these massive systems collect terabytes of monitoring data per day at multiple fidelity levels and varying temporal resolutions. In this work, we develop an incremental version of multiresolution dynamic mode decomposition (mrDMD), which converts high-dimensional data to spatial-temporal patterns at varied frequency ranges. Our incremental implementation of the mrDMD algorithm (I-mrDMD) promptly reveals valuable information in the massive environment log dataset, which is then visually aligned with the processed hardware and job log datasets through our generalizable rack visualization, built with D3 and integrated into the Jupyter Notebook interface. We demonstrate the efficacy of our approach with two use scenarios on a real-world dataset from a Cray XC40 supercomputer, Theta.
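As a rough illustration of the decomposition the description refers to, the sketch below implements plain single-level exact DMD in NumPy. It is not the authors' I-mrDMD: the function name `dmd`, its parameters, and the synthetic sensor data are all assumptions made for this example. mrDMD, as generally described, applies a step like this recursively, keeping the slow modes of a time window and then halving the window; the incremental variant from the talk is not reproduced here.

```python
import numpy as np


def dmd(snapshots, dt, rank):
    """Exact DMD of a (n_sensors, n_times) snapshot matrix (illustrative helper).

    Returns (modes, eigenvalues, frequencies_hz).
    """
    # Pair each snapshot with its successor: Y is approximately A @ X.
    X, Y = snapshots[:, :-1], snapshots[:, 1:]

    # Rank-truncated SVD of X.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    # Y V S^{-1}: dividing by s scales each of the `rank` columns.
    YVS = Y @ Vh.conj().T / s

    # Linear operator A projected onto the leading left singular vectors.
    A_tilde = U.conj().T @ YVS
    eigvals, W = np.linalg.eig(A_tilde)

    # Spatial modes, and the oscillation frequency of each mode in Hz.
    modes = YVS @ W
    freqs = np.log(eigvals).imag / (2 * np.pi * dt)
    return modes, eigvals, freqs


# Synthetic stand-in for sensor data: 20 hypothetical sensors sampled every
# 0.01 s, driven by two oscillations at 0.5 Hz and 2.0 Hz.
rng = np.random.default_rng(0)
dt = 0.01
t = np.arange(400) * dt
temporal = np.vstack([
    np.cos(2 * np.pi * 0.5 * t), np.sin(2 * np.pi * 0.5 * t),
    np.cos(2 * np.pi * 2.0 * t), np.sin(2 * np.pi * 2.0 * t),
])
data = rng.standard_normal((20, 4)) @ temporal

modes, eigvals, freqs = dmd(data, dt, rank=4)
print(np.sort(np.abs(freqs)))
```

Each eigenvalue pairs a spatial mode (a pattern across sensors) with a temporal frequency, which is the sense in which DMD turns high-dimensional readings into spatial-temporal patterns. A multiresolution variant would sort modes by `abs(freqs)` to separate slow from fast dynamics at each level.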
Event Type
Workshop
Time
Monday, 18 November 2024, 9:40am - 9:57am EST
Location
B311
Debugging and Correctness Tools
Performance Evaluation and/or Optimization Tools

