Compression for Scientific Data
Description

Large-scale numerical simulations, observations, experiments, and AI computations generate or consume very large datasets. Data compression is an effective technique for reducing the size of scientific datasets, making them easier to analyze, store, and transfer. The first part of this one-day tutorial reviews the motivations, principles, techniques, and error-analysis methods for lossy compression of scientific datasets. It details the main compression stages (decorrelation, approximation, coding) and their variations in state-of-the-art generic lossy compressors: SZ, ZFP, MGARD, and SPERR. The second part of the tutorial focuses on the trustworthiness of lossy compression, hands-on sessions, and customization of lossy compression to meet user-specific constraints. The third part discusses different ways of composing and testing specialized lossy compressors. The tutorial uses real-world scientific datasets to illustrate the different compression techniques and their performance, and features 2 hours of hands-on sessions on generic compressors and on composing specialized compressors. Participants are encouraged to bring their own data to make the tutorial productive. Given by the leading teams in this domain and primarily targeting beginners interested in learning about lossy compression for scientific data, the tutorial builds on the highly rated tutorials given at SC17-SC23.
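To make the three compression stages named above concrete, here is a minimal, hedged sketch of an error-bounded lossy compressor in Python. It is an illustration only, not the actual algorithm of SZ, ZFP, MGARD, or SPERR: it uses a simple previous-value (1-D Lorenzo-style) predictor for decorrelation, uniform scalar quantization with bin width 2*abs_err for approximation (which bounds the pointwise error by abs_err), and zlib as a stand-in for the entropy coder. All function names are illustrative.

```python
import zlib
import numpy as np

def compress(data, abs_err):
    """Toy error-bounded lossy compressor (illustrative sketch).

    Stage 2 first here: uniform scalar quantization with bin width
    2*abs_err, so |x - reconstruction| <= abs_err pointwise.
    Stage 1: decorrelate the quantized integers with a previous-value
    predictor (losslessly, so the error bound is preserved).
    Stage 3: lossless coding of the residual stream (zlib stands in
    for the Huffman/arithmetic coders used by real compressors).
    """
    q = np.round(data / (2 * abs_err)).astype(np.int64)
    resid = np.diff(q, prepend=0)  # residuals vs. previous value
    return zlib.compress(resid.tobytes())

def decompress(blob, abs_err):
    """Invert the three stages: decode, undo prediction, dequantize."""
    resid = np.frombuffer(zlib.decompress(blob), dtype=np.int64)
    q = np.cumsum(resid)           # undo the previous-value predictor
    return q * (2 * abs_err)       # bin centers; error <= abs_err

# Usage on a smooth synthetic field: smooth data quantizes to small,
# highly compressible residuals, and the absolute error bound holds.
x = np.linspace(0.0, 1.0, 10_000)
eb = 1e-4
blob = compress(x, eb)
xr = decompress(blob, eb)
assert np.max(np.abs(x - xr)) <= eb
assert len(blob) < x.nbytes
```

Because quantization is the only lossy stage and the predictor operates on the already-quantized integers, the reconstruction error is bounded by abs_err regardless of how well the predictor or coder performs; real compressors refine each stage (higher-order predictors, transforms, tuned entropy coders) to improve the compression ratio under the same guarantee.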