Presentation
ExpressHPC: Towards "Connected Supercomputing" Enabling On-Demand Job Execution for Disaster Resilience
Description
HPC systems are increasingly connected to external data sources via high-speed networks, and on-demand execution of urgent jobs in such a "connected supercomputing" environment could be an attractive approach to supporting disaster response and recovery. Indeed, a production system for real-time forecasting of tsunami damage is already in operation. When a large-scale earthquake occurs, seismic information is sent to Supercomputer AOBA at Tohoku University, which estimates the most likely tsunami damage within several minutes. The estimation results are then sent to the Japanese government to support its decision-making. However, on-demand job execution raises several technical challenges in resource management. A naive implementation of on-demand job execution will severely reduce system throughput. Moreover, if an urgent simulation with deadline constraints needs a large amount of computing resources, a single HPC system might be unable to secure enough resources immediately to meet the deadline. Therefore, we have started a new research project named ExpressHPC to offer an expressway to urgent allocation of computing resources from multiple datacenters. This lightning talk will introduce the ExpressHPC project and then discuss the technical challenges and potential opportunities for research collaboration.
Presenter