Session

Research, ACM SRC, and Doctoral Showcase Posters
Event Type: ACM Student Research Competition: Graduate Poster, ACM Student Research Competition: Undergraduate Poster, Doctoral Showcase, Posters
Time: Tuesday, 19 November 2024, 12pm - 5pm EST
Location: B302-B305
Registration Categories
TP
XO/EX
Presentations
Interactive and Tool-Agnostic ML-Driven Workflow for Automated HPC Performance Modeling
ORCHA: A Performance Portability System for Flash-X — A Multiphysics Application Software
Improving Polyhedral-Based Optimizations with Dynamic Coordinate Descent
Performance Engineering and Mesoscale-Microscale Coupling for Wind Energy Simulations
Establishing Best Practices for Applying Inline Compressed Arrays to Improve Performance in HPC
Stalls and Memory Analysis on Fujitsu A64FX and NVIDIA Grace
FortranX: Harnessing Code Generation, Portability, and Heterogeneity in Fortran
Hardware-Independent Sampling Library for CPUs and (Multi-)GPUs: hws
Fault-Tolerant Numerical Iterative Algorithms at Scale
Exploration of Super-Resolution Techniques for Image Compression
Seesaw: Elastic Scaling for Task-Based Distributed Programs
SanQus: Staleness and Quantization-Aware Full-Graph Decentralized Training in GNNs
Parallel Verification of Neural Networks Applied to Medical Imaging
Power Patterns: Understanding the Energy Dynamics of I/O for Parallel Storage Configurations
Meteorologic Real-Time Extreme Learning Machine for Pressure Prediction
On the Accuracy and Efficiency of Approximate Triangle Counting via Randomized Numerical Linear Algebra
Optimal Client Selection Algorithms for Federated Learning
Machine Learning Applications for Early-Stage Ovarian Cancer Diagnosis
A Comparison Study of Open Source LLMs for HPC Ticket Answering
MatRIS: Performance Portable Math Library of IRIS Runtime for Multi-Device Heterogeneity
Performance of Inline Compression with Software Caching for Reducing the Memory Footprint in pySDC
GPU Compression (for Scientific Data) Done Right
Design of Reliable and Efficient Syscall Hooking Library for a Parallel File System
Computational Radiation Hydrodynamics with FleCSI
Characterizing the Performance of the GENE-X Code for Gyrokinetic Turbulence Simulations
Turbocharging Dask Apps: Accelerating Data Flow with ProxyStore
NetCDFaster: A Geospatial Cyberinfrastructure Enhancing Multi-Dimensional Scientific Dataset Access and Visualization Through Machine Learning Optimization
Assessing the Impact of Real-Time Traffic Updates on Traffic Flow: A High-Performance Computing Perspective on Scalability and Demand
PINE: Efficient Yet Effective Piecewise Linear Trees
JACC: HPC Meta-Programming and Performance Portability Ecosystem for Julia Language
Profiling the Impact of Hyper-Threading on Pagosa Hydrocodes
Neural Network Optimization and Performance Analysis for Real-Time Object Detection at the Edge
Exploiting Data Compression and Low Precision for Exascale Fusion Turbulence Simulations
Enhancing HPC Resource Management to Integrate Quantum Workflows
LM-Offload: Performance Model-Guided Generative Inference of Large Language Models with Parallelism Control
QDD: Multi-Node Implementation of Decision Diagram-Based Quantum Circuit Simulator with Ring Communication and Auto SWAP Insertion
PerfFlowAspect: A User-Friendly Performance Tool for Scientific Workflows
Predicting Dataset Popularity for Improved Distributed Content Caching in High Energy Physics
Enhancing the Traditional Benchmarks for Parallel Computing Education
FAS-GED: GPU-Accelerated Graph Edit Distance Computation
Scored Non-Deterministic Finite Automata Processor for Sequence Alignment
Large-Scale Randomized Program Generation with Large Language Models
Comparing Cache Utilization Trends for Regional Scientific Caches with Transfer Learning Models
Analyzing Alltoall Algorithms with SST
Enhancing Performance Reproducibility on HPC Workflows
A Novel Gradient Compression Design with Ultra-High Compression Ratio for Communication-Efficient Federated Learning
PipeInfer: Accelerating LLM Inference Using Asynchronous Pipelined Speculation
Breaking the Barriers to Effective Supercomputing: Web Dashboard for Job Accounting and Performance Metrics
A Sparse Approach for Translation-Based Training of Knowledge Graph Embeddings
Scalable Low-Latency Hardware Function Chaining with Chain Control Circuit
HARVEST-2.0: High-Performance Vision Framework for End-to-End Preprocessing, Training, Inference, and Visualization
Exploring Fine-Grained Memory Analysis for PIM Offloading
Uncover the Overhead and Resource Usage for Handling KV Cache Overflow in LLM Inference
Scalable Performance and Accuracy Analysis for Distributed and Extreme-Scale Systems
Improving the Performance of Proof-of-Space in Blockchain Systems
Communication Hiding for Matrix-Free Finite Element Operators of a Complex PDE: Nonlinear Stokes Flow of Earth’s Mantle
Increasing the Efficiency of Neutral Atoms by Reducing Qubit Waste from Measurement-Related Ejections
The P3 Explorer: Exploring the Performance, Portability, and Productivity Wilderness
GNN-RL: An Intelligent HPC Resource Scheduler
PcMINER: Mining Performance-Related Commits at Scale
Profiling Communication Overhead in 3D Parallel Pretrain of Large Language Models
An Accurate and Scalable Multidimensional Quantum Solver for Partial Differential Equations
Generalizing ExaDigiT Datacenter Digital Twin Framework for Multiple Architectures
Edge-Enabled Real-Time Data Processing in Power-Efficient Weather Stations Using IBIS
CoVA: Compiler for Versatile Architectures
Trusted Platform Provisioning for the OpenCHAMI Cluster Management Stack
Parallelization of the Finite Element-Based Mesh Warping Algorithm Using Hybrid Parallel Programming
QFw: A Quantum Framework for Large-Scale HPC Ecosystems
Efficient Approaches to Analyzing Large Dynamic Networks
JUmPER: Performance Data Monitoring, Instrumentation and Visualization for Jupyter Notebooks
Memory Disaggregation in Serverless Computing
Evolving a Multi-Population Evolutionary-QAOA on Distributed QPUs
SWARM: Scientific Workflow Applications on Resilient Metasystem
Benchmarking and Modeling of Producer-Consumer Data Movement Performance in Scientific Workflows
Cluster-Based Methodology for Characterizing the Performance of Portable Applications
DART-X: Software Infrastructure for Prototyping In-Memory Data Transfer Between Ensemble Data Assimilation and Coupled Earth Systems Models
HPC Fastpass: Visualizing Descriptive and Predictive HPC Queue Time Data
MIGnificient: Fast, Isolated, and GPU-Enabled Serverless Functions
iSeeMore: Design of a 256-Node RPi Cluster to Visualize LLM Computation Through Light and Movement for Mass Audiences
5G in Practice: Measuring Emerging Wireless Technology in Rural Iowa for Edge Devices in Distributed Computation Workloads
Trackable Agent-Based Evolution Models at Wafer Scale
New Semi-Implicit Electrostatic Particle-In-Cell Method to Extend Scope of the Exascale WarpX Code
KVSort: Drastically Improving LLM Inference Performance via KV Cache Compression
Web-Based Simulator of Superscalar RISC-V Processors
Quantum Volume Benchmarking Simulators on HPC Systems
Simplifying HPC Resource Selection: A Tool for Optimizing Execution Time and Cost on Azure
Exploring DAOS as a Burst Buffer for a 100 Gbps DAQ Real-Time Streaming System
Persistent and Partitioned MPI for Stencil Communication
An Adaptive Kernel Execution for Dynamic Applications on GPUs Using CUDA Graphs
Mind Your Manners: Detoxifying Language Models via Attention Head Intervention
Bringing It HOME: Analyzing Contention Hotspots Across the Memory Hierarchy with Low Overhead
Active Learning for Metamaterial Optimization on HPC and QC Integrated Systems
An Error-Bounded Lossy Compression Method with Bit-Adaptive Quantization for Particle Data
Creating Code LLMs for HPC: It’s LLMs All the Way Down
Towards Scalable Quantum Simulation on Wafer-Scale Engines
I/O Characterization of Heterogeneous Workflows
RAPIDS: Reduced API Data-Transfer Specifications
A Zero-Copy Storage with Metadata-Driven File Management Using Persistent Memory
Exploring Software-Defined Networking for Routing in Dragonfly Topology
Cluster Management with Containerization on Switches
Benchmarking Quantum-Inspired Optimization Platforms and Tools on an HPC Cluster
Development of TEZip in PyTorch: Integrating New Prediction Models into an Existing Compression Framework
Improvement of Bridges-2 Resource Utilization Through User Optimization
Profiling and Bottleneck Identification for Large Language Model Optimizations
A Survey-Based Evaluation of the Efficacy of a Girls Who Code Club at the University of Southern Indiana
Improving SpGEMM Performance Through Reordering and Cluster-Wise Computation
Prompt Phrase Ordering Using Large Language Models in HPC: Evaluating Prompt Sensitivity
Proposal for a Parallel Automatic Tuning Using d-Spline According to the Operating State of the Computer System
Assessing Matrix Multiplication Performance with Fully Homomorphic Encryption
Generating Coupled Cluster Code for Modern Distributed Memory Tensor Software
Formal Approaches to Characterize Emerging Arithmetic Realizations
Integrating HPCToolkit with Tools for Automated Analysis
Performance of LAMMPS-SNAP in Different Runtime Environments
Algorithmic Patterns from Computational Biology for Proxy Application Development and Co-Design
Performance of N10 Benchmarks with Different BLAS Implementations
Eve: Less Memory, Same Might
Lagrangian Particle-Tracking in GPU-Enabled Extreme Scale Turbulence Simulations
Scalable Motif Counting on Large-Scale Dynamic Graphs
Empowering Scientific Datasets with Large Language Models
DisCostiC: Simulating MPI Applications Without Executing Code
Identifying Regions of Non-Determinism in HPC Simulations Through Event Graph Alignment
Fault Tolerance in Krylov Subspace Methods
Prototype Development and Testing of a Smart Buoy System for Coastal and Marine Ecosystems Using IBIS
AI-Based Scalable Analytics for Improving Performance and Resilience of HPC Systems
Large Genomic Language Models: Towards Their Hyperparameter Optimization
Poseidon: A Source-to-Source Translator for Holistic HPC Optimization of Ocean Models on Regular Grids
Algorithmic and Optimization Techniques for Graph Applications in Heterogeneous Systems at Scale
Efficient, Scalable, Robust Neuromorphic High Performance Computing
Going Beyond the Chicken and Egg Situation with Modern MPI Features
Effects of Lossy Compression Data on Machine Learning Models
Scalable Planning Platform for Orchestration of Autonomous Systems Across Edge-Cloud Continuum
Toward Performance & Portability & Productivity in Parallel Programming
Efficient Large Dynamic Graph Analysis on Emerging Storage Technology
Enhancing HPC I/O Performance: Leveraging Runtime and Offline I/O Optimization Frameworks
Q-NFSO: Exploring Quantum Applications, Noise Management, Fault Injection, Resource Scheduling and Optimization in the NISQ Era
Data Layout Optimizations for Tensor Applications
Designing Efficient Data Reduction Approaches for Multi-Resolution Simulations on HPC Systems
FFT-Based Spherical Harmonics and Radial Transforms on GPU
High-Performance Computing Resilience Analysis Using Large Language Models
Supporting End Users in Implementing Quantum Computing Applications
Accelerating Communications in High-Performance Scientific Workflows
Accelerating HPC Workflow Results and Performance Reproducibility Analytics