short-paper

OpenDMC: An Open-Source Library and Performance Evaluation for Deep-learning-based Multi-frame Compression

Authors:

Yongchi Zhang

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 9685 - 9688

https://doi.org/10.1145/3581783.3613464

Published: 27 October 2023

Abstract

Video streaming has become an essential part of everyday life. However, video data is costly to transmit and store, demanding substantial bandwidth and storage resources. To meet rapidly growing video transmission and storage requirements, deep-learning-based video compression has developed quickly in the past few years, and many new methods have been proposed to achieve better Rate-Distortion (RD) performance. However, the field still lacks an algorithm library that can effectively sort, classify, and extensively benchmark existing algorithms. In this paper, we present OpenDMC, an open-source algorithm library that integrates a variety of end-to-end video compression methods in cross-platform environments. We describe the algorithms in the library in detail, including their contributions and implementation details. We also benchmark them thoroughly, comparing and analyzing each algorithm on several metrics, including RD performance, running time, and GPU memory usage. The OpenDMC library is available at https://openi.pcl.ac.cn/OpenDMC/.
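As a rough illustration of what one point on an RD curve involves, the sketch below computes a rate (bits per pixel) and a distortion value. It is not taken from OpenDMC: the function names `psnr` and `bits_per_pixel` are hypothetical, and it assumes distortion is measured as PSNR over 8-bit pixel values, which is a common choice but only one of several possible quality metrics.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Return PSNR in dB between two equally sized sequences of pixel values.

    Hypothetical helper for illustration; `peak` assumes 8-bit samples.
    """
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: distortion-free
    return 10.0 * math.log10(peak ** 2 / mse)


def bits_per_pixel(num_bytes, width, height, num_frames=1):
    """Return the rate in bits per pixel for a compressed bitstream.

    Hypothetical helper: `num_bytes` is the total compressed size in bytes.
    """
    if width <= 0 or height <= 0 or num_frames <= 0:
        raise ValueError("width, height, and num_frames must be positive")
    return 8.0 * num_bytes / (width * height * num_frames)
```

An RD curve for one codec is then the set of (bits per pixel, PSNR) pairs obtained at several quality settings. Lower rate at equal quality, or higher quality at equal rate, indicates better RD performance.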



Published In


MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN: 9798400701085

DOI: 10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE)
Tao Mei (HiDream.ai, China)
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)

Program Chairs:
Marco Bertini (University of Florence, Italy)
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia)
Pradeep K. Atrey (University at Albany, State University of New York, USA)
M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States


Funding Sources

CAAI-Huawei MindSpore Open Fund
Shenzhen Fundamental Research Program
Shenzhen Science and Technology Plan Basic Research Project
Natural Science Foundation of China

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
