research-article

ENTRO: Tackling the Encoding and Networking Trade-off in Offloaded Video Analytics

Authors:

Kyunghan Lee

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 9115 - 9123

https://doi.org/10.1145/3581783.3613785

Published: 27 October 2023

Abstract

With rapid advances in deep learning and the commercialization of high-definition cameras in mobile and embedded devices, demand for high-quality video analytics (HVA) from latency-critical applications such as AR and XR is soaring. Because HVA aims to enable detailed analytics even for small objects, its on-device implementation suffers from thermal and battery issues, which makes offloaded HVA an attractive solution. This work provides unique observations on a tradeoff in offloaded HVA among the frame encoding time, the frame transmission time, and the HVA accuracy. Our observations pose a fundamental question: given a latency budget, how should the encoding option be chosen so that it balances the encoding time against the transmission time to maximize the HVA accuracy? To answer this question, we propose an offloaded HVA system, ENTRO, which exploits this tradeoff in real time to maximize the HVA accuracy under the latency budget. Our extensive evaluations with ENTRO implemented on Nvidia AGX Xavier and Samsung Galaxy S20 Ultra over WiFi networks show an 8.8× improvement in latency without accuracy loss compared to DDS, the state-of-the-art offloaded video analytics system. Our evaluation over commercial 5G and LTE networks also indicates that ENTRO flexibly adapts its encoding option under the tradeoff and enables latency-bounded HVA with 4K frames.
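As a rough illustration of the question posed in the abstract, the sketch below picks, from a few candidate encoding options, the one with the highest expected accuracy whose encoding time plus transmission time fits the latency budget. It is not ENTRO's algorithm: the option names, timings, frame sizes, accuracy values, and the `pick_option` helper are all hypothetical, and transmission time is simplified to frame size divided by available bandwidth.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EncodingOption:
    """One candidate encoder setting (all fields are illustrative assumptions)."""
    name: str          # hypothetical label for the setting
    encode_ms: float   # time to encode one frame on the device, in ms
    size_kbit: float   # size of the encoded frame, in kilobits
    accuracy: float    # expected analytics accuracy in [0, 1]


def pick_option(
    options: Sequence[EncodingOption],
    bandwidth_mbps: float,
    budget_ms: float,
) -> Optional[EncodingOption]:
    """Return the most accurate option that fits the latency budget, or None."""
    best: Optional[EncodingOption] = None
    for opt in options:
        # 1 Mbit/s moves 1 kbit per millisecond, so kbit / Mbps gives ms.
        transmit_ms = opt.size_kbit / bandwidth_mbps
        if opt.encode_ms + transmit_ms > budget_ms:
            continue  # this option would miss the deadline
        if best is None or opt.accuracy > best.accuracy:
            best = opt
    return best


# Hypothetical options: a fast encoder yields large frames, a slower encoder
# yields smaller frames at similar quality, and a low-quality setting is
# both fast and small but less accurate.
OPTIONS = [
    EncodingOption("fast_high_quality", encode_ms=5.0, size_kbit=6000.0, accuracy=0.92),
    EncodingOption("slow_high_quality", encode_ms=40.0, size_kbit=1500.0, accuracy=0.90),
    EncodingOption("fast_low_quality", encode_ms=3.0, size_kbit=1000.0, accuracy=0.70),
]
```

With an 80 ms budget in this toy setup, a 200 Mbps link favors the fast encoder with large frames, a 50 Mbps link favors the slower encoder with smaller frames, and a 10 Mbps link leaves no option that meets the deadline, which is the kind of shift between encoding time and transmission time that the abstract describes.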

Cited By

Ghasemi M, Kostic Z, Ghaderi J, Zussman G, Ganesan D, Lane N, Shi W (2024). EdgeCloudAI: Edge-Cloud Distributed Video Analytics. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, pp. 1778-1780. DOI: 10.1145/3636534.3698857. Online publication date: 4-Dec-2024
https://dl.acm.org/doi/10.1145/3636534.3698857

Index Terms

ENTRO: Tackling the Encoding and Networking Trade-off in Offloaded Video Analytics
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. Information systems applications
    1. Mobile information processing systems

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN:9798400701085

DOI:10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE)
Tao Mei (HiDream.ai, China)
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)

Program Chairs:
Marco Bertini (University of Florence, Italy)
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia)
Pradeep K. Atrey (University at Albany, State University of New York, USA)
M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

IITP grant (2021-0-02094) funded by the Korea government (MSIT)
IITP grant (2022-0-00420) funded by the Korea government (MSIT)
National Research Foundation of Korea (NRF) Grant through the Ministry of Science and ICT (MSIT), Korea Government, under Grant 2022R1A5A1027646
Cisco Systems (Grant 1368170)

Conference

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
