SmartSORT: an MLP-based method for tracking multiple objects in real-time

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

With recent advances in object detection, tracking-by-detection has become the leading paradigm for multi-object tracking algorithms. By extracting features from detected objects, these algorithms estimate similarities and association patterns between objects across successive frames. However, because the similarity functions used by tracking algorithms are handcrafted, they are difficult to reuse in new contexts. This study investigates the use of artificial neural networks to learn a similarity function that can be applied between detections. During training, multilayer perceptron (MLP) neural networks were presented with correct and incorrect association patterns sampled from a pedestrian tracking data set. To this end, different combinations of motion and appearance features were explored. Finally, a trained MLP was inserted into a multiple-object tracking framework, which was assessed on the MOT Challenge benchmark. In the experiments, the proposed tracker matched the results obtained by state-of-the-art methods, scoring a tracking accuracy of 60.4% while running 58% faster than DeepSORT, a recent and similar method used as a baseline. Overall, this work demonstrates that the method can be automatically trained for different tracking contexts and that it offers highly competitive cost-effectiveness for online real-time tracking applications.
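
The idea of replacing a handcrafted similarity function with an MLP score can be sketched roughly as follows. This is only an illustration under stated assumptions: the two motion features (IoU and normalized center distance), the hand-set placeholder weights, and the greedy matcher are hypothetical choices made for this example, not the features, trained parameters, or assignment procedure reported in the paper.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def pair_features(track_box, det_box):
    """Hypothetical motion features for one (track, detection) pair."""
    cx_t, cy_t = (track_box[0] + track_box[2]) / 2, (track_box[1] + track_box[3]) / 2
    cx_d, cy_d = (det_box[0] + det_box[2]) / 2, (det_box[1] + det_box[3]) / 2
    diag = math.hypot(track_box[2] - track_box[0], track_box[3] - track_box[1])
    dist = math.hypot(cx_t - cx_d, cy_t - cy_d) / diag if diag > 0 else 0.0
    return [iou(track_box, det_box), dist]

# Placeholder weights (assumption): 2 inputs -> 2 ReLU units -> 1 sigmoid output.
# In a real tracker these would be learned from labeled association patterns.
W1 = [[4.0, -2.0], [-4.0, 2.0]]
B1 = [0.0, 0.0]
W2 = [3.0, -3.0]
B2 = 0.0

def mlp_similarity(features):
    """Forward pass of the tiny MLP; returns a similarity score in (0, 1)."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(W1, B1)]
    logit = sum(w * h for w, h in zip(W2, hidden)) + B2
    if logit >= 0:  # numerically stable sigmoid
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)

def associate(tracks, detections, threshold=0.5):
    """Greedy one-to-one matching by descending MLP score (a simplification)."""
    scored = sorted(
        ((mlp_similarity(pair_features(t, d)), i, j)
         for i, t in enumerate(tracks)
         for j, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, i, j in scored:
        if score < threshold:
            break  # all remaining pairs score even lower
        if i in used_tracks or j in used_dets:
            continue
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)
    return sorted(matches)

# Two existing tracks and three detections in the next frame.
tracks = [(0, 0, 10, 20), (50, 0, 60, 20)]
detections = [(51, 1, 61, 21), (1, 0, 11, 20), (200, 200, 210, 220)]
matches = associate(tracks, detections)
print(matches)  # [(0, 1), (1, 0)]
```

Each match is a (track index, detection index) pair; the third detection stays unmatched and would typically start a new track. A tracker of this kind would more commonly solve the assignment globally rather than greedily; the greedy loop is used here only to keep the sketch dependency-free.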

Acknowledgements

This study has been financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001. This research was carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI) funded by FAPESP (Grant 2013/07375-0). The authors thank CNPq for granting a productivity scholarship to Hendrik Macedo [DT-II, Processo 306073/2017-0].

Author information

Corresponding author

Correspondence to Michel Meneses.

Ethics declarations

Conflict of interest

The authors declare that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Meneses, M., Matos, L., Prado, B. et al. SmartSORT: an MLP-based method for tracking multiple objects in real-time. J Real-Time Image Proc 18, 913–921 (2021). https://doi.org/10.1007/s11554-020-01054-y

  • DOI: https://doi.org/10.1007/s11554-020-01054-y
