DOI: 10.1145/3459637.3482335
research-article

Dynamic Early Exit Scheduling for Deep Neural Network Inference through Contextual Bandits

Published: 30 October 2021

Abstract

Recent advances in Deep Neural Networks (DNNs) have dramatically improved inference accuracy, but they have also increased inference latency. In this paper, we investigate how to utilize early exit, a novel method that allows inference to stop at an earlier exit point at the cost of an acceptable loss of accuracy. Scheduling the optimal exit point on a per-instance basis is challenging because the realized performance (i.e., confidence and latency) of each exit point is random and its statistics vary across scenarios. Moreover, the performance of the exit points is interdependent, which further complicates the problem. Therefore, the optimal exit scheduling decision cannot be known in advance and must instead be learned online. To this end, we propose Dynamic Early Exit (DEE), a real-time online learning algorithm based on contextual bandit analysis. DEE observes the performance at each exit point as context and decides whether to exit or keep processing. Unlike in standard contextual bandit analyses, the rewards of the decisions in our problem are temporally dependent. Furthermore, the earlier exit points are inevitably explored more than the later ones, which creates an unbalanced exploration-exploitation trade-off. DEE addresses these challenges, and its regret per inference asymptotically approaches zero. We compare DEE with four benchmark schemes in a real-world experiment. The results show that DEE can improve the overall performance by up to 98.1% compared to the best benchmark scheme.
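
To make the exit-or-continue decision loop concrete, the following is a minimal illustrative sketch and not the DEE algorithm itself. It swaps in a plain epsilon-greedy contextual bandit, and everything in it is an assumption made for illustration: the simulated confidences, the reward (confidence minus a per-exit latency cost), the constants, and all names such as `EpsilonGreedyExitScheduler`.

```python
import random
from collections import defaultdict

NUM_EXITS = 4         # hypothetical number of exit points
NUM_BINS = 5          # confidence is discretized into this many context bins
LATENCY_COST = 0.1    # hypothetical penalty per additional exit point processed
EXIT, CONTINUE = 0, 1


def simulate_confidences(rng):
    """Toy stand-in for a multi-exit DNN: confidence never decreases with depth."""
    conf, out = rng.uniform(0.2, 0.9), []
    for _ in range(NUM_EXITS):
        out.append(conf)
        conf += rng.uniform(0.0, 0.5) * (1.0 - conf)
    return out


def context_of(exit_idx, confidence):
    """Context = (exit point index, confidence bin)."""
    return exit_idx, min(NUM_BINS - 1, int(confidence * NUM_BINS))


class EpsilonGreedyExitScheduler:
    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # running mean reward per (context, action)
        self.count = defaultdict(int)

    def choose(self, ctx, explore=True):
        if explore and self.rng.random() < self.epsilon:
            return self.rng.choice((EXIT, CONTINUE))
        # Greedy choice; ties resolve to EXIT because it is listed first.
        return max((EXIT, CONTINUE), key=lambda a: self.value[(ctx, a)])

    def update(self, ctx, action, reward):
        key = (ctx, action)
        self.count[key] += 1
        self.value[key] += (reward - self.value[key]) / self.count[key]

    def run_instance(self, confidences, explore=True):
        """Walk the exit points in order; return (exit index, reward)."""
        visited = []
        for i, conf in enumerate(confidences):
            is_last = i == len(confidences) - 1
            ctx = context_of(i, conf)
            action = EXIT if is_last else self.choose(ctx, explore)
            visited.append((ctx, action))
            if action == EXIT:
                reward = conf - LATENCY_COST * i
                # The reward is realized only at the exit, so every decision
                # made on the way there is credited with the same final reward.
                if explore:
                    for c, a in visited:
                        self.update(c, a, reward)
                return i, reward


if __name__ == "__main__":
    env_rng = random.Random(1)
    scheduler = EpsilonGreedyExitScheduler()
    for _ in range(5000):
        scheduler.run_instance(simulate_confidences(env_rng))
    print(scheduler.run_instance([0.3, 0.6, 0.8, 0.9], explore=False))
```

Even in this toy setting, two of the difficulties named in the abstract show up: the reward of a continue decision depends on later decisions, and contexts at later exit points receive fewer updates because many instances exit before reaching them.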

Supplementary Material

MP4 File (CIKM21-rgfp0678.mp4)
Presentation video

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. contextual bandits
  2. dnn
  3. dnn inference
  4. early exit

Qualifiers

  • Research-article

Conference

CIKM '21
Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2024) Early-Exit Deep Neural Network - A Comprehensive Survey. ACM Computing Surveys 57:3 (1-37). DOI: 10.1145/3698767. Online publication date: 22-Nov-2024
  • (2024) Edge Intelligence for Internet of Vehicles: A Survey. IEEE Transactions on Consumer Electronics 70:2 (4858-4877). DOI: 10.1109/TCE.2024.3378509. Online publication date: May-2024
  • (2024) ClassyNet: Class-Aware Early-Exit Neural Networks for Edge Devices. IEEE Internet of Things Journal 11:9 (15113-15127). DOI: 10.1109/JIOT.2023.3344120. Online publication date: 1-May-2024
  • (2024) I-SplitEE: Image Classification in Split Computing DNNs with Early Exits. ICC 2024 - IEEE International Conference on Communications (2658-2663). DOI: 10.1109/ICC51166.2024.10622954. Online publication date: 9-Jun-2024
  • (2023) SplitEE: Early Exit in Deep Neural Networks with Split Computing. Proceedings of the Third International Conference on AI-ML Systems (1-9). DOI: 10.1145/3639856.3639873. Online publication date: 25-Oct-2023
  • (2023) Prediction Privacy in Distributed Multi-Exit Neural Networks: Vulnerabilities and Solutions. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (1123-1137). DOI: 10.1145/3576915.3623069. Online publication date: 15-Nov-2023
  • (2023) Collaborative Inference Acceleration Integrating DNN Partitioning and Task Offloading in Mobile Edge Computing. International Journal of Software Engineering and Knowledge Engineering 33:11n12 (1835-1863). DOI: 10.1142/S0218194023410085. Online publication date: 29-Nov-2023
  • (2023) Multi-Exit DNN Inference Acceleration Based on Multi-Dimensional Optimization for Edge Intelligence. IEEE Transactions on Mobile Computing 22:9 (5389-5405). DOI: 10.1109/TMC.2022.3172402. Online publication date: 1-Sep-2023
  • (2023) AdaEE: Adaptive Early-Exit DNN Inference Through Multi-Armed Bandits. ICC 2023 - IEEE International Conference on Communications (3726-3731). DOI: 10.1109/ICC45041.2023.10279243. Online publication date: 28-May-2023
  • (2023) Edge-Centric Optimization of Multi-modal ML-Driven eHealth Applications. Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing (95-125). DOI: 10.1007/978-3-031-40677-5_5. Online publication date: 7-Oct-2023
