Presentation
AI Surrogate Model for Distributed Computing Workloads
Session: AI4S: 5th Workshop on Artificial Intelligence and Machine Learning for Scientific Applications
Description: Large-scale international scientific collaborations, such as ATLAS, generate vast volumes of data and therefore require substantial computational power. Centralized workflow and data management systems are employed to handle these demands, but current decision-making processes for data placement and payload allocation are often heuristic and disjointed. This optimization challenge could be addressed with machine learning methods such as reinforcement learning, which in turn require access to extensive training data. We propose a generative surrogate modeling approach to address the lack of training data and concerns about privacy preservation. We collect and process real-world job records, apply four generative models for tabular data (TVAE, CTABGAN+, SMOTE, and TabDDPM) to these datasets, and thoroughly evaluate their performance. Experiments indicate that SMOTE and TabDDPM generate tabular data that most closely resembles the ground truth, but SMOTE ranks lowest in privacy preservation. We therefore conclude that TabDDPM, which is based on a probabilistic diffusion model, is the most suitable generative model for job record data.
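The trade-off between fidelity and privacy reported for SMOTE can be illustrated with a small sketch. It is not the authors' pipeline: it assumes purely numeric features, uses random Gaussian data as a hypothetical stand-in for job records, and uses distance to closest record (DCR) as one common privacy proxy, which may differ from the metric used in the work. The helper names `smote_like_sample` and `distance_to_closest_record` are made up for this example.

```python
import numpy as np


def smote_like_sample(real, n_samples, k=5, rng=None):
    """Draw synthetic rows by interpolating between a real row and one of
    its k nearest real neighbors (the core idea behind SMOTE).
    Assumes numeric features and len(real) > k."""
    rng = np.random.default_rng() if rng is None else rng
    real = np.asarray(real, dtype=float)
    # Pairwise Euclidean distances between all real rows.
    diffs = real[:, None, :] - real[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a row is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, len(real), size=n_samples)
    pick = rng.integers(0, k, size=n_samples)
    partner = neighbors[base, pick]
    lam = rng.random((n_samples, 1))  # interpolation weight in [0, 1)
    return real[base] + lam * (real[partner] - real[base])


def distance_to_closest_record(synthetic, real):
    """For each synthetic row, the Euclidean distance to its nearest real row.
    Small values mean the synthetic row sits close to a real record."""
    diffs = synthetic[:, None, :] - real[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)


rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))      # hypothetical numeric job features
synthetic = smote_like_sample(real, 200, k=5, rng=rng)
holdout = rng.normal(size=(200, 4))   # fresh draws from the same distribution

print("mean DCR, SMOTE-like:", distance_to_closest_record(synthetic, real).mean())
print("mean DCR, holdout:   ", distance_to_closest_record(holdout, real).mean())
```

Because every interpolated row lies on a segment between two neighboring real rows, it tends to sit closer to the training records than an independent draw does. This is consistent with SMOTE scoring well on similarity while scoring poorly on privacy.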