research-article

Deep Feature Compression with Rate-Distortion Optimization for Networked Camera Systems

Authors:
Ademola Ikusan

Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, USA

https://orcid.org/0000-0003-3933-6348

Rui Dai

Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, USA

https://orcid.org/0000-0001-6620-7862

MMSys '23: Proceedings of the 14th ACM Multimedia Systems Conference, June 2023, Pages 86–96, https://doi.org/10.1145/3587819.3590974

Published: 08 June 2023


ABSTRACT

Deep-learning-based video analysis solutions have become indispensable components in today's intelligent sensing applications. In a networked camera system, an efficient way to analyze the captured videos is to extract the features for deep learning at local cameras or edge devices and then transmit the features to powerful processing hubs for further analysis. As there exists substantial redundancy among different feature maps from the same video frame, the feature maps could be compressed before transmission to save bandwidth. This paper introduces a new rate-distortion optimized framework for compressing the intermediate deep features from the key frames of a video. First, to reduce the redundancy among different features, a feature selection strategy is designed based on hierarchical clustering. The selected features are then quantized, repacked as videos, and further compressed using a standardized video encoder. Furthermore, the proposed framework incorporates rate-distortion models that are built for three representative computer vision tasks: image classification, image segmentation, and image retrieval. A corresponding rate-distortion optimization module is designed to enhance the performance of common computer vision tasks under rate constraints. Experimental results show that the proposed deep feature compression framework can boost the compression performance over the standard HEVC video encoder.
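The selection, quantization, and rate-constrained steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the correlation-based distance, average linkage, medoid choice, uniform 8-bit quantizer, and the helper names `select_features`, `quantize`, and `pick_operating_point` are all assumptions made for illustration. Repacking the quantized maps into frames and encoding them with an HEVC encoder are left out.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length flattened feature maps."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant map: treat as uncorrelated
    return cov / math.sqrt(va * vb)

def select_features(maps, k):
    """Agglomerative (average-linkage) clustering of feature maps down to
    k clusters; returns the sorted index of one medoid map per cluster.
    The distance 1 - |correlation| is an assumed redundancy measure."""
    n = len(maps)
    dist = [[1.0 - abs(pearson(maps[i], maps[j])) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        # Average pairwise distance between two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the two closest clusters.
        a, b = min(
            ((p, q) for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[a] += clusters[b]
        del clusters[b]
    # Medoid: the member closest on average to the rest of its cluster.
    return sorted(min(c, key=lambda i: sum(dist[i][j] for j in c))
                  for c in clusters)

def quantize(fmap, bits=8):
    """Uniform min-max quantization; lo and hi are needed to dequantize."""
    lo, hi = min(fmap), max(fmap)
    levels = (1 << bits) - 1
    if hi == lo:
        return [0] * len(fmap), lo, hi
    return [round((x - lo) / (hi - lo) * levels) for x in fmap], lo, hi

def pick_operating_point(points, rate_budget):
    """points: (rate, task_distortion, label) tuples, e.g. measured offline.
    Returns the lowest-distortion point within the rate budget, or None."""
    feasible = [p for p in points if p[0] <= rate_budget]
    return min(feasible, key=lambda p: p[1]) if feasible else None

# Toy data: maps 0-2 are linearly related (redundant); map 3 is distinct.
maps = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 0, 1, 0]]
selected = select_features(maps, k=2)
q, lo, hi = quantize([0.0, 1.0, 4.0])
best = pick_operating_point(
    [(100, 0.30, "a"), (200, 0.10, "b"), (400, 0.05, "c")], rate_budget=250)
```

In this toy setup the three linearly related maps collapse into one cluster, so only two maps would need to be transmitted. The rate and distortion values passed to `pick_operating_point` are placeholders; in practice they would come from rate-distortion models fitted per task.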


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Computer graphics
    1. Image compression
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia streaming


Published in
MMSys '23: Proceedings of the 14th ACM Multimedia Systems Conference
June 2023
495 pages
ISBN:9798400701481
DOI:10.1145/3587819
Co-chairs:
Shervin Shirmohammadi,
Mohamed Hefeeda,
Program Co-chairs:
Roger Zimmermann,
Carsten Griwodz,
Mea Wang
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
compression
rate-distortion optimization
deep learning features
computer vision
edge computing
Acceptance Rates
Overall Acceptance Rate: 176 of 530 submissions, 33%