Abstract
As diverse types of data grow explosively, large-scale data storage, backup, and transmission have become challenging, motivating many researchers to propose efficient universal compression algorithms for multi-source data. In recent years, the emergence of hardware accelerators such as GPUs, TPUs, DPUs, and FPGAs has removed the performance bottleneck of neural networks (NNs), making NN-based compression algorithms increasingly practical and popular. However, no survey of NN-based universal lossless compressors has yet been conducted, and unified evaluation metrics are lacking. To address these problems, we present in this paper a holistic survey together with benchmark evaluations. Specifically, i) we thoroughly investigate NN-based lossless universal compression algorithms for multi-source data and classify them into three types: static pre-training, adaptive, and semi-adaptive; ii) we unify 19 evaluation metrics to comprehensively assess the compression effectiveness, resource consumption, and model performance of compressors; iii) we conduct more than 4,600 CPU/GPU hours of experiments to evaluate 17 state-of-the-art compressors on 28 real-world datasets spanning text, image, video, audio, and other data types; and iv) we summarize the strengths and drawbacks of NN-based lossless data compressors and discuss promising research directions. We publish the results as the NN-based Lossless Compressors Benchmark (NNLCB, see the fahaihi.github.io/NNLCB website), which will be updated and maintained continuously in the future.
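To make the evaluated quantities concrete, the following minimal Python sketch (our own illustration; the function names are hypothetical and do not come from the paper's NNLCB implementation) computes a few standard size/speed metrics of the kind such benchmarks report, along with the Shannon-optimal code length that an NN predictor coupled with an entropy coder, such as an arithmetic coder, can approach.

    import math

    def size_speed_metrics(original_bytes: int, compressed_bytes: int, seconds: float) -> dict:
        # Standard lossless-compression metrics (a small subset of a full benchmark suite).
        return {
            "compression_ratio": original_bytes / compressed_bytes,    # higher is better
            "bits_per_byte": 8.0 * compressed_bytes / original_bytes,  # lower is better
            "space_saving": 1.0 - compressed_bytes / original_bytes,   # fraction of space saved
            "throughput_mb_s": original_bytes / seconds / 1e6,         # compression speed
        }

    def ideal_code_length_bits(true_symbol_probs) -> float:
        # Shannon code length in bits that an arithmetic coder driven by a predictive
        # model can approach: -sum(log2 p_i), where p_i is the probability the model
        # assigned to the symbol that actually occurred at step i.
        return -sum(math.log2(p) for p in true_symbol_probs)

    # Example: a 1 MB input compressed to 250 KB in 2 s gives ratio 4.0 and 2.0 bits/byte.
    print(size_speed_metrics(1_000_000, 250_000, 2.0))
    # A model assigning probability 0.5 to each of 8 observed symbols needs at least 8 bits.
    print(ideal_code_length_bits([0.5] * 8))

The second function reflects the principle shared by all three compressor types surveyed here: the better the model predicts the next symbol, the shorter the entropy-coded output.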
Acknowledgements
This work was partly supported by the National Natural Science Foundation of China (Grant Nos. 62272253 and 62272252) and the Fundamental Research Funds for the Central Universities. It was also supported in part by the China Scholarship Council (CSC202406200085) and the Innovation Project of Guangxi Graduate Education (YCBZ2024005). The High-performance Computing Center of Guangxi University partly supported the experimental work. The authors thank the editor and anonymous reviewers for their constructive comments and suggestions for improving our manuscript.
Ethics declarations
Competing interests The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Electronic supplementary material Supplementary material is available in the online version of this article at journal.hep.com.cn and link.springer.com.
Hui SUN obtained his BSc and MSc degrees in information security and computer science from China University of Mining and Technology and Guangxi University, China in 2019 and 2022, respectively. He is currently pursuing a PhD degree at the College of Computer Science, Nankai University, China, and is also a visiting student at the College of Computing and Data Science (CCDS), Nanyang Technological University (NTU), Singapore. His research interests include AI for data compression, deep learning, and parallel computing. He has authored technical papers in conferences and journals such as DCC, ICPADS, ISBRA, Journal on Communications, Journal of Chinese Computer Systems, Bioinformatics, and BMC Bioinformatics.
Huidong MA obtained his BSc and MSc degrees in computer science from Hainan University, China and Guangxi University, China in 2020 and 2023, respectively. He is currently pursuing a PhD degree in computer science at Nankai University, China. His main research interests include data storage systems, machine learning, AI for data compression, and large language models. He has authored technical papers in conferences and journals, such as DCC, ICPADS, ISBRA, Bioinformatics, and BMC Bioinformatics.
Feng LING obtained a BSc degree in information security from Northeast University, China in 2022. He is currently pursuing an MSc degree at Nankai University, China. He is a member of the Nankai-Baidu Joint Laboratory and the Parallel and Distributed Software Technology Laboratory. His research interests include AI for data compression, high-performance computing, parallel algorithm design, and neural network inference frameworks.
Haonan XIE is currently pursuing a PhD degree in automation at the School of Electrical Engineering, Guangxi University, China. He is a member of CAAI, IEEE, and IET. His main research interests include artificial-intelligence-engaged energy conversion, systems engineering modeling, and the compression and management of big data in the power industry. He has authored technical papers in journals and conferences such as RSER, AE, DCC, ICPADS, Bioinformatics, and BMC Bioinformatics.
Yongxia SUN received her BSc and MSc degrees in information management & systems and computer technology from Tianjin Agricultural University, China and Hunan University of Technology and Business, China. She is currently pursuing a PhD degree at the College of Computer Science, Nankai University, China. She is a member of the Nankai-Baidu Joint Laboratory and the Parallel and Distributed Software Technology Laboratory. Her research interests include data compression and storage, data auditing, machine learning, and blockchain applications. She has authored technical papers in Computer Networks, CMC, etc.
Liping YI is currently pursuing a PhD degree at the College of Computer Science, Nankai University, China, and is also a visiting student at the College of Computing and Data Science (CCDS), Nanyang Technological University (NTU), Singapore. Her research interests include federated learning. She has authored technical papers in conferences such as NeurIPS, ICML, MM, IJCAI, ICASSP, ICWS, and DASFAA, and in journals such as TSC, TMC, and KBS. She has served as a reviewer for conferences and workshops including NeurIPS, ICML, ICLR, KDD, AAAI, IJCAI, CVPR, MM, ICASSP, ICME, FL-IJCAI'23, FL@FM-NeurIPS'23, FL@FM-TheWebConf'24, and FL@FM-ICME'24, and for journals including TMC, TNNLS, TGCN, and Neurocomputing.
Meng YAN obtained her BSc and PhD degrees in physics and computer science from Tianjin University and Nankai University, China in 2014 and 2022, respectively. She is an assistant professor at Nankai University, conducting postdoctoral research at the Nankai-Baidu Joint Laboratory. Her main research areas include blockchain, machine learning, and AI for data compression. She has published technical papers in SRDS, Chinese Journal of Electronics, Bioinformatics, etc. She is currently a member of the editorial board of the journal Blockchain.
Cheng ZHONG received his PhD degree in computer science and technology from the University of Science and Technology of China, China. He is currently a professor with the School of Computer, Electronics and Information at Guangxi University, China. He has led several national and provincial research projects, published more than 150 journal/conference papers, and edited 5 books. His research interests include parallel computing, bioinformatics, distributed computing, and information security. He is an outstanding member of the China Computer Federation (CCF).
Xiaoguang LIU received his BSc, MSc, and PhD degrees in computer science from Nankai University, China in 1996, 1999, and 2002, respectively. He is currently a professor at the Department of Computer Science, Nankai University, China. His research interests include search engines, storage systems, GPU computing, and federated learning. He has authored technical papers in conferences such as DCC, ICML, AAAI, IJCAI, WWW, SIGIR, and VLDB, and in journals such as TC, TPDS, TOS, TKDE, TDSC, TMM, TNNLS, and TCSVT.
Gang WANG received his BSc, MSc, and PhD degrees in computer science from Nankai University, China in 1996, 1999, and 2002, respectively. He is currently a professor at the Department of Computer Science, Nankai University, China. His research interests include parallel computing, storage systems, data mining, machine learning, and federated learning. He has authored technical papers in conferences such as ICML, AAAI, IJCAI, WWW, SIGIR, DCC, VLDB, and ACM MM, and in journals such as TC, TPDS, TOS, TKDE, TDSC, TNNLS, and TCSVT.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, H., Ma, H., Ling, F. et al. A survey and benchmark evaluation for neural-network-based lossless universal compressors toward multi-source data. Front. Comput. Sci. 19, 197360 (2025). https://doi.org/10.1007/s11704-024-40300-5