
Embedded YOLO: Faster and Lighter Object Detection

Published: 01 September 2021

Abstract

Object detection is a fundamental task in computer vision. Most current algorithms demand substantial computing resources, which hinders their deployment on embedded systems. In this research, we propose a neural network model named Embedded YOLO to address this problem. We introduce the DSC_CSP module, which replaces the middle layers of YOLOv5s to reduce the number of model parameters. To offset the performance loss that this reduction would otherwise cause, we apply knowledge distillation. To make better use of the information provided by data augmentation, we propose Dynamic Interpolation Mosaic, an improvement on the original Mosaic method. Because the number of samples is severely imbalanced across data types, we employ a two-stage training scheme. The proposed model achieved the best results in the ICMR2021 Grand Challenge PAIR Competition, with 0.59 mAP, a model size of 12 MB, and 41 FPS on MediaTek's Dimensity 1000 platform. These results confirm that the proposed model is suitable for object detection on embedded systems.
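The abstract does not state which distillation formulation Embedded YOLO uses. As a minimal sketch of the general idea only, the Python below implements the classic soft-target loss for a single classification output: a student is trained against both the ground-truth label and a teacher's temperature-softened predictions. The function names, the `temperature` and `alpha` defaults, and the restriction to plain class logits are all assumptions made for illustration; they are not taken from the paper, and a detector would also need to handle box and objectness outputs.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence.

    temperature and alpha are illustrative defaults, not values from the paper.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-target term: KL(teacher || student) at the raised temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student))

    # Scaling by T^2 keeps the soft term's gradients comparable as T changes.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the weighted hard-label term remains; the further the two distributions diverge, the larger the penalty. For example, `distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], label=2)` is larger than the same call with matching teacher logits.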

Published In

ICMR '21: Proceedings of the 2021 International Conference on Multimedia Retrieval
August 2021
715 pages
ISBN:9781450384636
DOI:10.1145/3460426
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cspnet
  2. embedded system
  3. knowledge distillation
  4. lightweight model
  5. object detection
  6. yolo

Qualifiers

  • Research-article

Funding Sources

  • Ministry of Science and Technology

Conference

ICMR '21

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 109
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 17 Feb 2025

Cited By

  • (2024) A Low-Cost Deep-Learning-Based System for Grading Cashew Nuts. Computers 13:3 (71). DOI: 10.3390/computers13030071. Online publication date: 8-Mar-2024
  • (2024) High-Resolution Eye-Tracking System for Accurate Measurement of Short-Latency Ocular Following Responses: Development and Observational Study. JMIR Pediatrics and Parenting 7 (e64353-e64353). DOI: 10.2196/64353. Online publication date: 9-Dec-2024
  • (2024) Object Detection Algorithms in Embedded Systems and their Applications. Frontier Computing on Industrial Applications Volume 1 (194-200). DOI: 10.1007/978-981-99-9299-7_27. Online publication date: 21-Jan-2024
  • (2023) Production Evaluation of Citrus Fruits based on the YOLOv5 compressed by Knowledge Distillation. 2023 26th International Conference on Computer Supported Cooperative Work in Design (CSCWD) (1938-1943). DOI: 10.1109/CSCWD57460.2023.10152740. Online publication date: 24-May-2023
