GLEAN: Generalized-Deduplication-Enabled Approximate Edge Analytics


Abstract:

The Internet of Things (IoT) has brought about exponential growth in sensor data. This has led to increasing demands for efficient and novel data transmission, storage, and analytics solutions for sustainable IoT ecosystems. It has been shown that the generalized deduplication (GD) compression algorithm offers not only competitive compression ratio and throughput but also random access properties that enable direct analytics of compressed data. In this article, we thoroughly stress test existing methods for direct analytics of GD compressed data with a diverse collection of 103 data sets, identify the need to optimize GD for analytics, and develop a new version of GD to this end. We also propose the generalized deduplication-enabled approximate edge analytics (GLEAN) framework. This framework applies the aforementioned analytics techniques at the Edge server to deliver end-to-end lossless data compression and high-quality Edge analytics in the IoT, thereby addressing challenges related to data transmission, storage, and analytics. Impressive analytics performance was achieved using this framework, with a median increase in k-means clustering error of just 2% relative to analytics performed on uncompressed data, while running 7.5× faster and requiring 3.9× less storage at the Edge server compared to universal compressors.
Published in: IEEE Internet of Things Journal (Volume: 10, Issue: 5, 01 March 2023)
Page(s): 4006 - 4020
Date of Publication: 11 April 2022
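
The abstract's claim that GD-compressed data can be analyzed directly is easier to picture with a toy example. The sketch below is not the article's implementation. It assumes one simple GD-style transformation: each non-negative integer sample is split into a base (its high bits) and a deviation (its low bits), each distinct base is stored once, and every sample keeps a base index plus its deviation. The function names and the `deviation_bits` parameter are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple


def gd_compress(samples: List[int], deviation_bits: int) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Split each sample into (base, deviation) and store each distinct base once.

    Assumes non-negative integer samples (an illustrative simplification).
    """
    mask = (1 << deviation_bits) - 1
    base_ids: Dict[int, int] = {}           # base value -> index into `bases`
    bases: List[int] = []                   # deduplicated bases
    records: List[Tuple[int, int]] = []     # one (base index, deviation) per sample
    for x in samples:
        base, deviation = x >> deviation_bits, x & mask
        if base not in base_ids:
            base_ids[base] = len(bases)
            bases.append(base)
        records.append((base_ids[base], deviation))
    return bases, records


def gd_decompress(bases: List[int], records: List[Tuple[int, int]], deviation_bits: int) -> List[int]:
    """Lossless reconstruction: recombine each base with its deviation."""
    return [(bases[i] << deviation_bits) | d for i, d in records]


def approximate_values(bases: List[int], records: List[Tuple[int, int]], deviation_bits: int) -> List[int]:
    """Approximate each sample from its base alone, ignoring the deviation.

    The midpoint of the deviation range stands in for the missing low bits,
    so the absolute error is at most 2**(deviation_bits - 1).
    """
    half = (1 << deviation_bits) >> 1
    return [(bases[i] << deviation_bits) + half for i, _ in records]


samples = [1000, 1003, 1010, 5000, 5001]
bases, records = gd_compress(samples, deviation_bits=4)
restored = gd_decompress(bases, records, deviation_bits=4)
approx = approximate_values(bases, records, deviation_bits=4)
```

In this toy setting, an analytics routine such as k-means could run on `approx` (or on the much shorter `bases` list with per-base counts) without touching the deviations, while `gd_decompress` still recovers the original samples exactly. How GLEAN actually selects bases and performs the clustering is described in the article itself and may differ from this sketch.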
