DOI: 10.1145/3583740.3626630

Poster: Efficient Video Instance Segmentation with Early Exit at the Edge

Published: 07 August 2024

Abstract

Video instance segmentation has emerged as a critical component in enabling connected vehicles to comprehend complex driving scenes, thereby facilitating navigation under various driving conditions. Recent advances focus on video-based solutions, which leverage temporal and spatial information to outperform traditional image-based approaches. However, the high computational and memory demands of these video-based solutions make them difficult to deploy efficiently on edge devices such as intelligent vehicles, and the large size of video data makes uploading it to cloud servers impractical. To address the latency challenge of on-device inference, we propose incorporating early exits into the model. While the early exit strategy has been successful in image classification and natural language processing tasks, our study is the first to explore its application in video instance segmentation. Specifically, we incorporate early exits into VisTR, a transformer-based video instance segmentation model. Our experimental results on the YouTube-VIS dataset demonstrate that early exit can speed up inference by up to 4.83× at a cost of only 3% in average precision scores. Furthermore, our qualitative analysis confirms the satisfactory quality of the generated segmentation masks.
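
The general early-exit idea referred to in the abstract can be illustrated with a minimal sketch. The abstract does not state the exit criterion or where the exits are attached in VisTR, so everything below is an assumption made for illustration: the confidence-threshold rule (in the spirit of BranchyNet-style exits), the names `early_exit_inference`, `layers`, `exit_heads`, and `threshold`, and the toy classification setting that stands in for segmentation outputs.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_inference(
    features: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    exit_heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Run layers in order and stop at the first sufficiently confident exit.

    Returns (predicted_class, number_of_layers_executed).
    Assumption: one hypothetical exit head per layer, and the exit rule is
    "maximum softmax probability >= threshold". The final layer always exits.
    """
    if not layers or len(layers) != len(exit_heads):
        raise ValueError("need at least one layer and one exit head per layer")

    for depth, (layer, head) in enumerate(zip(layers, exit_heads), start=1):
        features = layer(features)          # pay the cost of this layer
        probs = softmax(head(features))     # cheap prediction at this depth
        confidence = max(probs)
        if confidence >= threshold or depth == len(layers):
            return probs.index(confidence), depth

    raise AssertionError("unreachable: the final layer always exits")


# Toy stand-ins: each "layer" sharpens the features, each "head" reads them
# directly as class logits. Real layers would be transformer blocks.
def sharpen(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def identity_head(x: Vector) -> Vector:
    return list(x)


toy_layers = [sharpen] * 6
toy_heads = [identity_head] * 6

easy_input = [2.0, 0.0]   # becomes confident after very few layers
hard_input = [0.1, 0.0]   # needs more layers before confidence is high

easy_result = early_exit_inference(easy_input, toy_layers, toy_heads)
hard_result = early_exit_inference(hard_input, toy_layers, toy_heads)
```

In this sketch, inputs that become confident early skip the remaining layers, which is where the latency saving comes from; lowering `threshold` trades accuracy for speed. For scale, the reported speedup of up to 4.83× corresponds to an inference time of roughly 1/4.83, or about 21%, of the baseline.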

Published In

SEC '23: Proceedings of the Eighth ACM/IEEE Symposium on Edge Computing
December 2023
405 pages
ISBN:9798400701238
DOI:10.1145/3583740
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Poster

Funding Sources

  • CNS
  • OAC
  • SES

Conference

SEC '23: Eighth ACM/IEEE Symposium on Edge Computing
December 6 - 9, 2023
Wilmington, DE, USA

Acceptance Rates

Overall Acceptance Rate 40 of 100 submissions, 40%
