Abstract
Multiple object tracking (MOT) has long been an active research topic in computer vision. However, most current MOT algorithms pay little attention to the prediction module, and in data association they rely on manual tuning to set the matching threshold. In this paper, we propose a new MOT algorithm. By introducing the Siamese RPN network as a predictor in the advanced detection module, the algorithm improves accuracy while adapting better to complex and diverse application scenarios. In addition, by analyzing the distance matrix in the data association module, we design a simple adaptive method for determining the threshold, which avoids many redundant tuning experiments and removes the need for manual intervention. Combined with a self-designed matching strategy, the resulting MOT algorithm achieves high accuracy and adapts to complex and diverse scenarios, such as nonlinear and high-speed motion. Finally, the effectiveness and advantages of each module are verified on the MOT16, MOT17, and MOT20 benchmarks.
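The abstract does not spell out the threshold rule, so the following is only an illustrative sketch of what deriving a matching threshold from the distance matrix could look like. The largest-gap heuristic, the greedy matcher, and the names `adaptive_threshold` and `associate` are assumptions introduced here for illustration; they are not the authors' method.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # rows: existing tracks, columns: new detections


def adaptive_threshold(dist: Matrix) -> float:
    """Pick a gating threshold from the distance matrix itself.

    Hypothetical heuristic (not the paper's rule): sort all distances and
    place the threshold in the middle of the largest gap between neighbours,
    assuming true matches cluster at small distances.
    """
    values = sorted(v for row in dist for v in row)
    if len(values) < 2:
        return float("inf")  # nothing to separate, so accept any pair
    gaps = [(values[i + 1] - values[i], i) for i in range(len(values) - 1)]
    _, i = max(gaps)
    return (values[i] + values[i + 1]) / 2.0


def associate(dist: Matrix) -> Tuple[List[Tuple[int, int]], float]:
    """Greedy one-to-one matching of tracks to detections below the threshold."""
    thr = adaptive_threshold(dist)
    # Visit candidate pairs from the smallest distance upward.
    pairs = sorted((v, r, c) for r, row in enumerate(dist) for c, v in enumerate(row))
    used_rows, used_cols, matches = set(), set(), []
    for v, r, c in pairs:
        if v >= thr:
            break  # every remaining pair is at least as far apart
        if r not in used_rows and c not in used_cols:
            matches.append((r, c))
            used_rows.add(r)
            used_cols.add(c)
    return matches, thr


# Made-up distances for three tracks and three detections.
dist = [
    [0.10, 0.90, 0.80],
    [0.85, 0.20, 0.95],
    [0.90, 0.80, 0.70],
]
matches, thr = associate(dist)
print(matches, thr)
```

On this made-up matrix the threshold falls between 0.2 and 0.7, so tracks 0 and 1 are matched to detections 0 and 1, and track 2 stays unmatched without any hand-set cutoff. A production tracker would more likely use an optimal assignment solver than this greedy pass.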
Acknowledgments
This work was supported in part by the Science Technology Commission Project "Intelligent identification and optimization of the control strategy for shield tunneling state" (No. 18DZ1205502) and by the Science Technology Commission Project "Risk analysis of urban viaduct traffic safety" (No. 18DZ1201204).
About this article
Cite this article
Gao, X., Shen, Z. & Yang, Y. Multi-object tracking with Siamese-RPN and adaptive matching strategy. SIViP 16, 965–973 (2022). https://doi.org/10.1007/s11760-021-02041-x
DOI: https://doi.org/10.1007/s11760-021-02041-x