Presentation
Effects of Lossy Compression Data on Machine Learning Models
Description
Machine learning is a fundamental tool used across academia and industry. Because training machine learning models requires large amounts of data, compression plays a critical role in storage by reducing the data footprint. Machine learning uses algorithms and models to learn patterns in data, allowing systems to make decisions without being explicitly programmed. Compression, by contrast, uses encoding and decoding techniques to reduce file size. Compression can be lossy or lossless: lossy compression discards some information, while lossless compression preserves the data exactly.
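To make the lossy/lossless distinction concrete, the following is a minimal sketch and not the dissertation's method. It assumes a synthetic 8-bit signal, uses Python's standard `zlib` for the lossless path, and uses a hypothetical `quantize` helper (uniform quantization followed by `zlib`) as a stand-in for a real lossy codec. The `psnr` helper and the step size of 16 are likewise illustrative choices.

```python
import math
import random
import zlib


def quantize(samples, step):
    """Lossy step (illustrative): snap each 8-bit sample to the nearest multiple of `step`."""
    return bytes(min(255, (s + step // 2) // step * step) for s in samples)


def psnr(original, distorted):
    """Peak signal-to-noise ratio in dB for 8-bit samples (inf if identical)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)


random.seed(0)
# Synthetic data (assumption): a slow ramp with small noise, clipped to 0..255.
data = bytes(max(0, min(255, i // 16 + random.randint(-3, 3))) for i in range(4096))

# Lossless: the decoded bytes match the input exactly.
lossless = zlib.compress(data)
restored = zlib.decompress(lossless)

# Lossy: quantization discards detail, which makes the result easier to compress.
lossy_samples = quantize(data, 16)
lossy = zlib.compress(lossy_samples)

ratio_lossless = len(data) / len(lossless)
ratio_lossy = len(data) / len(lossy)
quality = psnr(data, lossy_samples)

print(f"lossless ratio: {ratio_lossless:.2f}, exact round trip: {restored == data}")
print(f"lossy ratio:    {ratio_lossy:.2f}, PSNR: {quality:.1f} dB")
```

In this sketch, a larger quantization step would raise the compression ratio and lower the PSNR, which is the kind of trade-off between compression ratio and data quality that a model trained or evaluated on decompressed data has to tolerate.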
This dissertation explores the accuracy and scalability of machine learning when working with data distorted by lossy compression. The performance metrics studied measure how accurately the model performs inference. Issues with machine learning performance on lossy data involve data storage, data transfer bandwidth, and processing at the intersection of machine learning and lossy compression. Across these issues, machine learning is examined in different domains. This work investigates how meaningful patterns are extracted from the distorted data.
The primary focus is on the ability of neural network models to handle lossy compressed data and on ways to mitigate the loss due to distortion. The work addresses machine learning tasks including object detection, semantic segmentation, and image classification, with the aim of finding the balance between compression ratio and data quality.