AdaConfigure: Reinforcement Learning-Based Adaptive Configuration for Video Analytics Services

He, Zhaoliang; Wang, Yuan; Tang, Chen; Wang, Zhi; Zhu, Wenwu; Guo, Chenyang; Chen, Zhibo

doi:10.1007/978-3-030-98358-1_20

Conference paper
First Online: 15 March 2022

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13141)

Abstract

The configuration in video analytics defines parameters such as frame rate, image resolution, and model selection for the video analytics pipeline, and thus determines the inference accuracy and resource consumption. Traditional solutions for selecting a configuration are either fixed (i.e., the same configuration is used all the time) or periodically adjusted by brute-force search (i.e., periodically trying different configurations and selecting the one with the best performance); they therefore suffer from either low inference accuracy or a high computation cost to find a proper configuration in time. To this end, we propose AdaConfigure, a configuration adaptation framework for video analytics that dynamically selects the video configuration without resource-consuming exploration. First, we design a reinforcement learning-based framework in which an agent adaptively chooses the configuration according to the spatial and temporal features of the current video stream. In particular, we use a video segmentation strategy to capture the characteristics of the video stream at a much-reduced computation cost: profiling uses only 0.2–2% of the computation resources required to process the full video. Second, we design a reward function that considers both inference accuracy and computation resource consumption, so that the chosen configuration achieves a good trade-off between the two. Our evaluation on an object detection task shows that our approach outperforms the baseline: it achieves 10–35% higher accuracy with a similar amount of computation resources, or similar accuracy with only 10–50% of the computation resources.
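As a rough illustration of the accuracy and resource trade-off described in the abstract, the sketch below enumerates a small configuration space and scores a configuration with a linear reward. It is not the reward function from the paper, whose exact form is not given here. The knob values, the `MODEL_COST` table, the cost proxy in `relative_cost`, and the `trade_off` weight are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Config:
    fps: int          # frames analysed per second
    resolution: int   # frame height in pixels
    model: str        # detector variant


# Hypothetical knob values (assumptions, not taken from the paper).
FPS = (1, 5, 30)
RESOLUTIONS = (240, 480, 960)
MODEL_COST = {"small": 1.0, "large": 4.0}  # assumed relative per-pixel cost


def config_space():
    """Enumerate every combination of the three knobs."""
    return [Config(f, r, m) for f, r, m in product(FPS, RESOLUTIONS, MODEL_COST)]


def relative_cost(cfg):
    """Crude compute proxy, normalised so the most expensive config costs 1.0."""
    raw = cfg.fps * cfg.resolution ** 2 * MODEL_COST[cfg.model]
    max_raw = max(FPS) * max(RESOLUTIONS) ** 2 * max(MODEL_COST.values())
    return raw / max_raw


def reward(accuracy, cfg, trade_off=0.5):
    """Assumed linear reward: measured accuracy minus weighted compute cost."""
    return accuracy - trade_off * relative_cost(cfg)
```

With a reward of this shape, a cheaper configuration is preferred whenever it reaches the same accuracy, and a larger `trade_off` value makes the agent more willing to give up accuracy to save computation.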

Acknowledgements

This work is supported in part by NSFC (Grant No. 61872215) and the Shenzhen Science and Technology Program (Grant No. RCYX20200714114523079). We would like to thank Tencent for sponsoring the research.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Beijing, China
Zhaoliang He & Wenwu Zhu
Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, China
Yuan Wang
Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
Chen Tang & Zhi Wang
Peng Cheng Laboratory, Shenzhen, China
Zhaoliang He
Tencent Youtu Lab, Shanghai, China
Chenyang Guo & Zhibo Chen

Corresponding author

Correspondence to Zhi Wang.

Editor information

Editors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Björn Þór Jónsson
Dublin City University, Dublin, Ireland
Cathal Gurrin
University of Science, VNU-HCM, Ho Chi Minh City, Vietnam
Minh-Triet Tran
University of Bergen, Bergen, Norway
Duc-Tien Dang-Nguyen
National Tsing Hua University, Hsinchu, Taiwan
Anita Min-Chun Hu
Hanoi University of Science and Technology, Hanoi, Vietnam
Binh Huynh Thi Thanh
Median Technologies, Valbonne, France
Benoit Huet

Cite this paper

He, Z. et al. (2022). AdaConfigure: Reinforcement Learning-Based Adaptive Configuration for Video Analytics Services. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13141. Springer, Cham. https://doi.org/10.1007/978-3-030-98358-1_20

DOI: https://doi.org/10.1007/978-3-030-98358-1_20
Published: 15 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98357-4
Online ISBN: 978-3-030-98358-1
eBook Packages: Computer Science, Computer Science (R0)
