AdaConfigure: Reinforcement Learning-Based Adaptive Configuration for Video Analytics Services

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13141)

Abstract

The configuration of a video analytics pipeline defines parameters such as frame rate, image resolution, and model selection, and thus determines both inference accuracy and resource consumption. Traditional approaches either use a fixed configuration (i.e., the same configuration at all times) or adjust it periodically through brute-force search (i.e., periodically trying different configurations and selecting the one that performs best); they therefore suffer from either low inference accuracy or a high computation cost for finding a suitable configuration in time. To address this, we propose AdaConfigure, a configuration adaptation framework for video analytics that dynamically selects the configuration without resource-consuming exploration. First, we design a reinforcement learning-based framework in which an agent adaptively chooses the configuration according to the spatial and temporal features of the current video stream. In particular, we use a video segmentation strategy to capture the characteristics of the video stream at a much-reduced computation cost: profiling uses only 0.2–2% of the computation resources needed to process the full video. Second, we design a reward function that considers both inference accuracy and computation resource consumption, so that the chosen configuration achieves a good trade-off between the two. Our evaluation on an object detection task shows that our approach outperforms the baseline: it achieves 10–35% higher accuracy with a similar amount of computation resources, or similar accuracy with only 10–50% of the computation resources.
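As an illustration of the ideas in the abstract, the following minimal Python sketch enumerates a small configuration space over the three knobs named there (frame rate, resolution, model) and scores each configuration with a reward that trades accuracy against computation cost. The abstract does not give the form of the reward function, so the linear form `accuracy - beta * cost`, the weight `beta`, the knob values, the cost model, and all names below are assumptions made for illustration only, not the authors' method.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Config:
    """One point in the configuration space (hypothetical representation)."""
    fps: int          # frames analysed per second
    resolution: int   # image height in pixels
    model: str        # key into MODELS


# Hypothetical knob values; the paper's actual choices are not given here.
FRAME_RATES = (1, 5, 30)
RESOLUTIONS = (480, 720)
MODELS = {"small": 1.0, "large": 4.0}  # assumed relative cost per pixel


def action_space():
    """All combinations of the three knobs."""
    return [Config(f, r, m) for f, r, m in product(FRAME_RATES, RESOLUTIONS, MODELS)]


def _raw_cost(cfg):
    # Assumed cost model: frames per second x pixels (~ height squared) x model cost.
    return cfg.fps * cfg.resolution ** 2 * MODELS[cfg.model]


def relative_cost(cfg):
    """Cost normalised by the most expensive configuration, in (0, 1]."""
    return _raw_cost(cfg) / max(_raw_cost(c) for c in action_space())


def reward(accuracy, cost, beta=0.5):
    """Assumed linear trade-off: higher accuracy is rewarded, cost is penalised."""
    return accuracy - beta * cost


def best_config(measured_accuracy, beta=0.5):
    """Pick the configuration with the highest reward.

    measured_accuracy maps Config -> accuracy in [0, 1] (illustrative input).
    """
    return max(
        measured_accuracy,
        key=lambda c: reward(measured_accuracy[c], relative_cost(c), beta),
    )
```

With `beta = 0` the sketch simply prefers the most accurate configuration; increasing `beta` shifts the choice toward cheaper configurations. In a reinforcement learning setting, a reward of this kind would be the signal the agent learns from rather than something it evaluates exhaustively as `best_config` does here.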

Acknowledgements

This work is supported in part by the NSFC (Grant No. 61872215) and the Shenzhen Science and Technology Program (Grant No. RCYX20200714114523079). We would like to thank Tencent for sponsoring the research.

Author information

Corresponding author

Correspondence to Zhi Wang.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

He, Z. et al. (2022). AdaConfigure: Reinforcement Learning-Based Adaptive Configuration for Video Analytics Services. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13141. Springer, Cham. https://doi.org/10.1007/978-3-030-98358-1_20

  • DOI: https://doi.org/10.1007/978-3-030-98358-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-98357-4

  • Online ISBN: 978-3-030-98358-1

  • eBook Packages: Computer Science, Computer Science (R0)
