MobileNet-JDE: a lightweight multi-object tracking model for embedded systems

Tsai, Chi-Yi; Su, Yu-Kai

doi:10.1007/s11042-022-12095-9

Published: 14 February 2022

Volume 81, pages 9915–9937, (2022)

Multimedia Tools and Applications

Abstract

Multi-object tracking (MOT) is one of the most challenging tasks in computer vision. Although many MOT methods have been proposed in the literature, most cannot achieve real-time performance, especially on embedded platforms with limited computing resources. In this paper, we propose a real-time lightweight MOT method based on MobileNet to improve MOT processing speed. The proposed method consists of a lightweight MOT model and a post-processing module. For the lightweight MOT model, we enhance the lightweight object detection model proposed in our previous work by adding an appearance embedding layer. We also propose a new anchor box design and a novel feature pyramid network (FPN) to improve tracking accuracy. In the post-processing module, we propose a simple filtering method that replaces the Kalman filter used in data association, which speeds up processing. Experimental results show that the proposed MOT method reaches processing speeds of 50.5 frames per second (FPS) on a desktop computer and 12.6 FPS on an embedded platform. The proposed method also provides competitive tracking performance compared with existing MOT methods. These advantages make it suitable for many applications running on embedded platforms, such as visual surveillance, visual tracking control of mobile robots, and human-robot interaction.
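
To make the role of the appearance embedding in data association more concrete, the following is a minimal sketch of appearance-only matching between existing tracks and new detections. It is not the authors' implementation: the abstract does not specify the association rule or the simple filtering method, so the function names, the greedy matching strategy (used here in place of an optimal assignment solver), and the `max_distance` threshold are all assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as dissimilar to everything
    return 1.0 - dot / (norm_a * norm_b)


def associate(track_embeddings, detection_embeddings, max_distance=0.4):
    """Greedily match tracks to detections by ascending cosine distance.

    Returns (matches, unmatched_tracks, unmatched_detections), where
    matches is a list of (track_index, detection_index) pairs.
    max_distance is a hypothetical gating threshold, not a value from the paper.
    """
    # Collect every track/detection pair that passes the distance gate.
    candidates = []
    for t, t_emb in enumerate(track_embeddings):
        for d, d_emb in enumerate(detection_embeddings):
            dist = cosine_distance(t_emb, d_emb)
            if dist <= max_distance:
                candidates.append((dist, t, d))
    candidates.sort()

    # Accept the closest pairs first; each track and detection is used once.
    matches, used_tracks, used_dets = [], set(), set()
    for _, t, d in candidates:
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)

    unmatched_tracks = [t for t in range(len(track_embeddings)) if t not in used_tracks]
    unmatched_dets = [d for d in range(len(detection_embeddings)) if d not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


if __name__ == "__main__":
    tracks = [[1.0, 0.0], [0.0, 1.0]]
    detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
    print(associate(tracks, detections))
```

In this toy example, each track is paired with the detection whose embedding points in nearly the same direction, and the third detection is left unmatched, so a tracker would typically start a new track for it. A complete tracker would also combine this appearance cue with a motion or position cue, which is the part the abstract says is handled by the simple filtering method.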

Acknowledgments

The authors sincerely thank Professor Humaira Nisar from the Department of Electronics Engineering of Universiti Tunku Abdul Rahman, Malaysia, for participating in the revision of the manuscript. This research was supported by the Ministry of Science and Technology of Taiwan under Grant MOST 110-2221-E-032-047 and Grant MOST 109-2221-E-032-039.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Tamkang University, 151 Yingzhuan Road, Tamsui District, New Taipei City, 251, Taiwan, Republic of China
Chi-Yi Tsai & Yu-Kai Su

Corresponding author

Correspondence to Chi-Yi Tsai.

About this article

Cite this article

Tsai, CY., Su, YK. MobileNet-JDE: a lightweight multi-object tracking model for embedded systems. Multimed Tools Appl 81, 9915–9937 (2022). https://doi.org/10.1007/s11042-022-12095-9

Received: 30 March 2021
Revised: 04 August 2021
Accepted: 03 January 2022
Published: 14 February 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11042-022-12095-9
