Abstract
Object detection and tracking is among the most complex and challenging problems for artificial intelligence (AI) systems that model perception. Object tracking has practical importance in AI applications such as human–machine interaction, robotics, autonomous driving and extended reality. The fundamental task of object tracking is to detect objects in one video frame and maintain their identities, or infer their trajectories, across all subsequent frames. Real-world tracking systems typically operate in highly complex and dynamic environments, where object appearance and scene conditions change constantly, making it difficult to characterize target objects adequately with a single model. Traditional AI solutions rely on handcrafted features derived from rigorous mathematical formulations. Designing such features is highly non-trivial and restricts the resulting solutions to narrowly focused application settings. Today, deep learning techniques are the preferred approach owing to their high generalization ability and ease of implementation. This paper surveys the most important deep learning-based appearance modeling techniques. We propose a taxonomy of approaches based on the architectural elements and auxiliary strategies employed in deep learning models for robust appearance modeling. The surveyed methodologies include data-centric techniques, compositional part modeling, similarity learning, memory and attention mechanisms, and approaches that integrate differentiable models within deep learning architectures to explicitly model spatial transformations. We highlight the fundamental principles, implementation details and application contexts of these approaches, together with their main strengths and potential limitations, and present common datasets, evaluation metrics and performance results.
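The core task stated above — detect objects per frame and maintain their identities across frames — is the essence of the tracking-by-detection paradigm. As an illustration only (not any specific method from this survey), the identity-maintenance step can be sketched as a greedy overlap-based association between existing track boxes and new detections; the function names and the IoU threshold here are our own illustrative choices:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def associate(tracks, detections, threshold=0.3):
    """Greedily match existing tracks (id -> box) to new-frame detections
    by IoU; detections left unmatched start new identities."""
    assignments, used = {}, set()
    for tid, tbox in tracks.items():
        best, best_iou = None, threshold
        for j, dbox in enumerate(detections):
            if j not in used and iou(tbox, dbox) > best_iou:
                best, best_iou = j, iou(tbox, dbox)
        if best is not None:
            assignments[tid] = best
            used.add(best)
    next_id = max(tracks, default=-1) + 1  # fresh id for unmatched detections
    for j in range(len(detections)):
        if j not in used:
            assignments[next_id] = j
            next_id += 1
    return {tid: detections[j] for tid, j in assignments.items()}
```

Real systems replace the IoU score with learned appearance similarity and the greedy loop with optimal assignment, which is precisely where the appearance models surveyed here enter the pipeline.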
Discover the latest articles, news and stories from top researchers in related subjects.References
Zhu, H., Wei, H., Li, B., Yuan, X., Kehtarnavaz, N.: A review of video object detection: datasets, metrics and methods. Appl. Sci. 10(21), 7834 (2020)
Voigtlaender, P., Luiten, J., Torr, P.H., Leibe, B.: Siam r-cnn: visual tracking by re-detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6578–6588 (2020)
Elharrouss, O., Almaadeed, N., Al-Maadeed, S., Bouridane, A., Beghdadi, A.: A combined multiple action recognition and summarization for surveillance video sequences. Appl. Intell. 51(2), 690–712 (2021)
Najeeb, H.D., Ghani, R.F.: A survey on object detection and tracking in soccer videos. MJPS 8(1), 1–13 (2021)
Siddique, A., Medeiros, H.: Tracking passengers and baggage items using multi-camera systems at security checkpoints. arXiv preprint arXiv:2007.07924 (2020)
Krishna, V., Ding, Y., Xu, A., Höllerer, T.: Multimodal biometric authentication for VR/AR using EEG and eye tracking. In: Adjunct of the 2019 International Conference on Multimodal Interaction, pp. 1–5 (2019)
D’Ippolito, F., Massaro, M., Sferlazza, A.: An adaptive multi-rate system for visual tracking in augmented reality applications. In: IEEE 25th International Symposium on Industrial Electronics (ISIE), vol. 2016, pp. 355–361. IEEE (2016)
Guo, Z., Huang, Y., Hu, X., Wei, H., Zhao, B.: A survey on deep learning based approaches for scene understanding in autonomous driving. Electronics 10(4), 471 (2021)
Moujahid, D., Elharrouss, O., Tairi, H.: Visual object tracking via the local soft cosine similarity. Pattern Recognit. Lett. 110, 79–85 (2018)
Wang, N., Shi, J., Yeung, D.Y., Jia, J.: Understanding and diagnosing visual tracking systems. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3101–3109 (2015)
Li, X., Hu, W., Shen, C., Zhang, Z., Dick, A., Hengel, A.V.D.: A survey of appearance models in visual object tracking. ACM Trans. Intell. Syst. Technol. 4(4), 1–48 (2013)
Dutta, A., Mondal, A., Dey, N., Sen, S., Moraru, L., Hassanien, A.E.: Vision tracking: a survey of the state-of-the-art. SN Comput. Sci. 1(1), 1–19 (2020)
Walia, G.S., Kapoor, R.: Recent advances on multicue object tracking: a survey. Artif. Intell. Rev. 46(1), 1–39 (2016)
Manafifard, M., Ebadi, H., Moghaddam, H.A.: A survey on player tracking in soccer videos. Comput. Vis. Image Underst. 159, 19–46 (2017)
Luo, W., Xing, J., Milan, A., Zhang, X., Liu, W., Zhao, X, et al.: Multiple object tracking: a literature review. arXiv preprint arXiv:1409.7618 (2014)
SM, J.R., Augasta, G.: Review of recent advances in visual tracking techniques. Multimed. Tools Appl. 16, 24185–24203 (2021)
Ciaparrone, G., Sánchez, F.L., Tabik, S., Troiano, L., Tagliaferri, R., Herrera, F.: Deep learning in video multi-object tracking: a survey. Neurocomputing 381, 61–88 (2020)
Marvasti-Zadeh, S.M., Cheng, L., Ghanei-Yakhdan, H., Kasaei, S.: Deep learning for visual tracking: a comprehensive survey. IEEE Trans. Intell. Transp. Syst. 23, 3943–3968 (2021)
Xu, Y., Zhou, X., Chen, S., Li, F.: Deep learning for multiple object tracking: a survey. IET Comput. Vis. 13(4), 355–368 (2019)
Li, P., Wang, D., Wang, L., Lu, H.: Deep visual tracking: review and experimental comparison. Pattern Recognit. 76, 323–338 (2018)
Sun, Z., Chen, J., Liang, C., Ruan, W., Mukherjee, M.: A survey of multiple pedestrian tracking based on tracking-by-detection framework. IEEE Trans. Circuits Syst. Video Technol. 31, 1819–1833 (2020)
Fiaz, M., Mahmood, A., Jung, S.K.: Tracking noisy targets: a review of recent object tracking approaches. arXiv preprint arXiv:1802.03098 (2018)
Sugirtha, T., Sridevi, M.: A survey on object detection and tracking in a video sequence. In: Proceedings of International Conference on Computational Intelligence, pp. 15–29. Springer (2022)
Brunetti, A., Buongiorno, D., Trotta, G.F., Bevilacqua, V.: Computer vision and deep learning techniques for pedestrian detection and tracking: a survey. Neurocomputing 300, 17–33 (2018)
Ravoor, P.C., Sudarshan, T.: Deep learning methods for multi-species animal re-identification and tracking—a survey. Comput. Sci. Rev. 38, 100289 (2020)
Kamble, P.R., Keskar, A.G., Bhurchandi, K.M.: Ball tracking in sports: a survey. Artif. Intell. Rev. 52(3), 1655–1705 (2019)
Fahmidha, R., Jose, S.K.: Vehicle and pedestrian video-tracking: a review. In: 2020 International Conference on Communication and Signal Processing (ICCSP), pp. 227–232. IEEE (2020)
Shukla, A., Saini, M.: Moving object tracking of vehicle detection: a concise review. Int. J. Signal Process. Image Process. Pattern Recognit. 8(3), 169–176 (2015)
Karuppuchamy, S., Selvakumar, R.: A Survey and study on “vehicle tracking algorithms in video surveillance system”. In: 2017 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), pp. 1–4. IEEE (2017)
Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Cehovin Zajc, L., et al.: The sixth visual object tracking vot2018 challenge results. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)
Kristan, M., Matas, J., Leonardis, A., Felsberg, M., Pflugfelder, R., Kamarainen, J.K., et al.: The seventh visual object tracking vot2019 challenge results. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)
Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Kämäräinen, J.K., et al.: The eighth visual object tracking VOT2020 challenge results. In: European Conference on Computer Vision, pp. 547–601. Springer (2020)
Dendorfer, P., Osep, A., Milan, A., Schindler, K., Cremers, D., Reid, I., et al.: Motchallenge: a benchmark for single-camera multiple target tracking. Int. J. Comput Vis. 129(4), 845–881 (2021)
Lan, L., Wang, X., Zhang, S., Tao, D., Gao, W., Huang, T.S.: Interacting tracklets for multi-object tracking. IEEE Trans. Image Process. 27(9), 4585–4597 (2018)
Milan, A., Schindler, K., Roth, S.: Multi-target tracking by discrete-continuous energy minimization. IEEE Trans. Pattern Anal. Mach. Intell. 38(10), 2054–2068 (2015)
Nam, H., Han, B.: Learning multi-domain convolutional neural networks for visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4293–4302 (2016)
Li, H., Li, Y., Porikli, F., et al.: DeepTrack: learning discriminative feature representations by convolutional neural networks for visual tracking. In: BMVC, vol. 1, p. 3 (2014)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., et al.: Ssd: single shot multibox detector. In: European Conference on Computer Vision, pp. 21–37. Springer (2016)
Song, Y., Ma, C., Gong, L., Zhang, J., Lau, R.W., Yang, M.H.: Crest: convolutional residual learning for visual tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2555–2564 (2017)
Sharif Razavian, A., Azizpour, H., Sullivan, J., Carlsson, S.: CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 806–813 (2014)
Hong, S., You, T., Kwak, S., Han, B.: Online tracking by learning discriminative saliency map with convolutional neural network. In: International Conference on Machine Learning, pp. 597–606. PMLR (2015)
Tao, Q.Q., Zhan, S., Li, X.H., Kurihara, T.: Robust face detection using local CNN and SVM based on kernel combination. Neurocomputing 211, 98–105 (2016)
Niu, X.X., Suen, C.Y.: A novel hybrid CNN-SVM classifier for recognizing handwritten digits. Pattern Recognit. 45(4), 1318–1325 (2012)
Li, H., Li, Y., Porikli, F.: Deeptrack: learning discriminative feature representations online for robust visual tracking. IEEE Trans. Image Process. 25(4), 1834–1848 (2015)
Wang, N., Yeung, D.Y.: Learning a deep compact image representation for visual tracking. In: Advances in Neural Information Processing Systems (2013)
Zhou, K., Yang, Y., Hospedales, T., Xiang, T.: Deep domain-adversarial image generation for domain generalisation. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 34, pp. 13025–13032 (2020)
Bolme, D.S., Beveridge, J.R., Draper, B.A., Lui, Y.M., Visual object tracking using adaptive correlation filters. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition. vol. 2010, pp. 2544–2550. IEEE (2010)
Danelljan, M., Hager, G., Shahbaz Khan, F., Felsberg, M.: Convolutional features for correlation filter based visual tracking. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 58–66 (2015)
Zhang, F., Ma, S., Qiu, Z., Qi, T.: Learning target-aware background-suppressed correlation filters with dual regression for real-time UAV tracking. Signal Process. 191, 108352 (2022)
Ma, C., Huang, J.B., Yang, X., Yang, M.H.: Hierarchical convolutional features for visual tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3074–3082 (2015)
Bertinetto, L., Valmadre, J., Golodetz, S., Miksik, O., Torr, P.H.: Staple: complementary learners for real-time tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1401–1409 (2016)
Li, Y., Zhu, J.: A scale adaptive kernel correlation filter tracker with feature integration. In: European Conference on Computer Vision, pp. 254–265. Springer (2014)
Danelljan, M., Robinson, A., Khan, F.S., Felsberg, M.: Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: European Conference on computer vision, pp. 472–488. Springer (2016)
Tang, S., Andriluka, M., Andres, B., Schiele, B.: Multiple people tracking by lifted multicut and person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3539–3548 (2017)
Kieritz, H., Hubner, W., Arens, M.: Joint detection and online multi-object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1459–1467 (2018)
Wang, Z., Zheng, L., Liu, Y., Wang, S.: Towards real-time multi-object tracking. arXiv preprint arXiv:1909.12605 (2019)
Sultana, F., Sufian, A., Dutta, P.: A review of object detection models based on convolutional neural network. In: Image Processing Based Applications, Intelligent Computing, pp. 1–16 (2020)
Henschel, R., Leal-Taixé, L., Cremers, D., Rosenhahn, B.: Fusion of head and full-body detectors for multi-object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 1428–1437 (2018)
Chu, P., Wang, J., You, Q., Ling, H., Liu, Z.: TransMOT: spatial-temporal graph transformer for multiple object tracking. arXiv preprint arXiv:2104.00194 (2021)
Yu, E., Li, Z., Han, S., Wang, H.: RelationTrack: relation-aware multiple object tracking with decoupled representation. arXiv preprint arXiv:2105.04322 (2021)
Pang, J., Qiu, L., Li, X., Chen, H., Li, Q., Darrell T, et al.: Quasi-dense similarity learning for multiple object tracking. arXiv preprint arXiv:2006.06664 (2020)
Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: FairMOT: on the fairness of detection and re-identification in multiple object tracking. arXiv preprint arXiv:2004.01888 (2020)
Ullah, M., Cheikh, F.A.: Deep feature based end-to-end transportation network for multi-target tracking. In: 25th IEEE International Conference on Image Processing (ICIP), vol. 2018, pp. 3738-3742. IEEE (2018)
Ren, L., Lu, J., Wang, Z., Tian, Q., Zhou, J.: Collaborative deep reinforcement learning for multi-object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 586–602 (2018)
Leal-Taixé, L., Canton-Ferrer, C., Schindler, K.: Learning by tracking: siamese CNN for robust target association. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 33–40 (2016)
Zhang, S., Gong, Y., Huang, J.B., Lim, J., Wang, J., Ahuja, N., et al.: Tracking persons-of-interest via adaptive discriminative features. In: European Conference on Computer Vision, pp. 415–433. Springer (2016)
Chen, L., Ai, H., Shang, C., Zhuang, Z., Bai, B.: Online multi-object tracking with convolutional neural networks. In: 2017 IEEE international conference on image processing (ICIP), pp. 645–649. IEEE (2017)
Xu, Y., Osep, A., Ban, Y., Horaud, R., Leal-Taixé, L., Alameda-Pineda, X.: How to train your deep multi-object tracker. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6787–6796 (2020)
Yoon, Y.C., Kim, D.Y., Song, Y.M., Yoon, K., Jeon, M.: Online multiple pedestrians tracking using deep temporal appearance matching association. Inf. Sci. 561, 326–351 (2021)
Bergmann, P., Meinhardt, T., Leal-Taixe, L.: Tracking without bells and whistles. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 941–951 (2019)
Zhou, X., Koltun, V., Krähenbühl, P.: Tracking objects as points. In: European Conference on Computer Vision, pp. 474–490. Springer (2020)
Jia, Y.J., Lu, Y., Shen, J., Chen, Q.A., Chen, H., Zhong, Z., et al.: Fooling detection alone is not enough: adversarial attack against multiple object tracking. In: International Conference on Learning Representations (ICLR’20) (2020)
Feichtenhofer, C., Pinz, A., Zisserman, A.: Detect to track and track to detect. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3038–3046 (2017)
Lu, Z., Rathod, V., Votel, R., Huang, J.: Retinatrack: online single stage joint detection and tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14668–14678 (2020)
Wu, J., Cao, J., Song, L., Wang, Y., Yang, M., Yuan, J.: Track to detect and segment: an online multi-object tracker. arXiv preprint arXiv:2103.08808 (2021)
Chaabane, M., Zhang, P., Beveridge, J.R., O’Hara, S.: DEFT: detection embeddings for tracking. arXiv preprint arXiv:2102.02267 (2021)
Sampath, V., Maurtua, I., Martín, J.J.A., Gutierrez, A.: A survey on generative adversarial networks for imbalance problems in computer vision tasks. J. Big Data. 8(1), 1–59 (2021)
Krawczyk, B.: Learning from imbalanced data: open challenges and future directions. Progress Artif. Intell. 5(4), 221–232 (2016)
Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., Hu, W.: Distractor-aware siamese networks for visual object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 101–117 (2018)
Song, Y., Ma, C., Wu, X., Gong, L., Bao, L., Zuo, W., et al.: Vital: visual tracking via adversarial learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8990–8999 (2018)
Bhat, G., Johnander, J., Danelljan, M., Khan, F.S., Felsberg, M.: Unveiling the power of deep tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 483–498 (2018)
Wang, Y., Wei, X., Tang, X., Shen, H., Ding, L.: CNN tracking based on data augmentation. Knowl.-Based Syst. 194, 105594 (2020)
Neuhausen, M., Herbers, P., König, M.: Synthetic data for evaluating the visual tracking of construction workers. In: Construction Research Congress 2020: Computer Applications, pp. 354–361. American Society of Civil Engineers Reston, VA (2020)
Gaidon, A., Wang, Q., Cabon, Y., Vig, E.: Virtual worlds as proxy for multi-object tracking analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4340–4349 (2016)
Shermeyer, J., Hossler, T., Van Etten, A., Hogan, D., Lewis, R., Kim, D.: Rareplanes: synthetic data takes flight. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 207–217 (2021)
Han, Y., Zhang, P., Huang, W., Zha, Y., Cooper, G., Zhang, Y.: Robust visual tracking using unlabeled adversarial instance generation and regularized label smoothing. Pattern Recognit. 1–15 (2019)
Cheng, X., Song, C., Gu, Y., Chen, B.: Learning attention for object tracking with adversarial learning network. EURASIP J. Image Video Process. 2020(1), 1–21 (2020)
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al.: Generative adversarial networks. arXiv preprint arXiv:1406.2661 (2014)
Han, Y., Zhang, P., Huang, W., Zha, Y., Cooper, G.D., Zhang, Y.: Robust visual tracking based on adversarial unlabeled instance generation with label smoothing loss regularization. Pattern Recognit. 97, 107027 (2020)
Yin, Y., Xu, D., Wang, X., Zhang, L.: Adversarial feature sampling learning for efficient visual tracking. IEEE Trans. Autom. Sci. Eng. 17(2), 847–857 (2019)
Wang, F., Wang, X., Tang, J., Luo, B., Li, C.: VTAAN: visual tracking with attentive adversarial network. Cognit. Comput. 13, 646–656 (2020)
Javanmardi, M., Qi, X.: Appearance variation adaptation tracker using adversarial network. Neural Netw. 129, 334–343 (2020)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Kim, H.I., Park, R.H.: Siamese adversarial network for object tracking. Electron. Lett. 55(2), 88–90 (2018)
Wang, X., Li, C., Luo, B., Tang, J.: Sint++: Robust visual tracking via adversarial positive instance generation. In: Proceedings of the IEEE Conference on Computer Vision and pattern recognition, pp. 4864–4873 (2018)
Guo, J., Xu, T., Jiang, S., Shen, Z.: Generating reliable online adaptive templates for visual tracking. In: 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 226–230. IEEE (2018)
Wu, Q., Chen, Z., Cheng, L., Yan, Y., Li, B., Wang, H.: Hallucinated adversarial learning for robust visual tracking. arXiv preprint arXiv:1906.07008 (2019)
Kim, Y., Shin, J., Park, H., Paik, J.: Real-time visual tracking with variational structure attention network. Sensors 19(22), 4904 (2019)
Lin, C.C., Hung, Y., Feris, R., He, L.: Video instance segmentation tracking with a modified vae architecture. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13147–13157 (2020)
Cheng, X., Zhang, Y., Zhou, L., Zheng, Y.: Visual tracking via auto-encoder pair correlation filter. IEEE Trans. Ind. Electron. 67(4), 3288–3297 (2019)
Wang, L., Pham, N.T., Ng, T.T., Wang, G., Chan, K.L., Leman, K.: Learning deep features for multiple object tracking by using a multi-task learning strategy. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 838–842. IEEE (2014)
Liu, P., Li, X., Liu, H., Fu, Z.: Online learned Siamese network with auto-encoding constraints for robust multi-object tracking. Electronics 8(6), 595 (2019)
Xu, L., Niu, R.: Semi-supervised visual tracking based on variational siamese network. In: International Conference on Dynamic Data Driven Application Systems, pp. 328–336. Springer (2020)
Tao, R., Gavves, E., Smeulders, AW.: Siamese instance search for tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1420–1429 (2016)
Hariharan, B., Girshick, R.: Low-shot visual recognition by shrinking and hallucinating features. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3018–3027 (2017)
Wei, L., Zhang, S., Gao, W., Tian, Q.: Person transfer gan to bridge domain gap for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 79–88 (2018)
Li, K., Zhang, Y., Li, K., Fu, Y.: Adversarial feature hallucination networks for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13470–13479 (2020)
Schwartz, E., Karlinsky, L., Shtok, J., Harary, S., Marder, M., Feris, R., et al.: Delta-encoder: an effective sample synthesis method for few-shot object recognition. arXiv preprint arXiv:1806.04734 (2018)
Amirkhani, A., Barshooi, A.H., Ebrahimi, A.: Enhancing the robustness of visual object tracking via style transfer. Comput. Mater. Contin. 70(1), 981–997 (2022)
López-Sastre, R.J., Tuytelaars, T., Savarese, S.: Deformable part models revisited: A performance evaluation for object category pose estimation. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1052–1059. IEEE (2011)
Chen, X., Kundu, K., Zhang, Z., Ma, H., Fidler, S., Urtasun, R.: Monocular 3d object detection for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2147–2156 (2016)
Felzenszwalb, P.F., Huttenlocher, D.P.: Pictorial structures for object recognition. Int. J. Comput. Vis. 61(1), 55–79 (2005)
Papandreou, G., Zhu, T., Chen, L.C., Gidaris, S., Tompson, J., Murphy, K.: Personlab: person pose estimation and instance segmentation with a bottom-up, part-based, geometric embedding model. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 269–286 (2018)
Rad, M., Lepetit, V.: Bb8: a scalable, accurate, robust to partial occlusion method for predicting the 3d poses of challenging objects without using depth. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3828–3836 (2017)
Wang, B., Wang, L., Shuai, B., Zuo, Z., Liu, T., Luk Chan, K., et al.: Joint learning of convolutional neural networks and temporally constrained metrics for tracklet association. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1–8 (2016)
Uricár, M., Franc, V., Hlavác, V.: Facial landmark tracking by tree-based deformable part model based detector. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 10–17 (2015)
Crivellaro, A., Rad, M., Verdie, Y., Moo Yi, K., Fua, P., Lepetit, V.: A novel representation of parts for accurate 3D object detection and tracking in monocular images. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4391–4399 (2015)
Li, J., Wong, H.C., Lo, S.L., Xin, Y.: Multiple object detection by a deformable part-based model and an R-CNN. IEEE Signal Process. Lett. 25(2), 288–292 (2018)
De Ath, G., Everson, R.M.: Part-based tracking by sampling. arXiv preprint arXiv:1805.08511 (2018)
Liu, W., Sun, X., Li, D.: Robust object tracking via online discriminative appearance modeling. EURASIP J. Adv. Signal Process. 2019(1), 1–9 (2019)
Wang, G., Yuan, Y., Chen, X., Li, J., Zhou, X.: Learning discriminative features with multiple granularities for person re-identification. In: Proceedings of the 26th ACM International Conference on Multimedia, pp. 274–282 (2018)
Tian, Y., Luo, P., Wang, X., Tang, X.: Deep learning strong parts for pedestrian detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1904–1912 (2015)
Chen, X., Mottaghi, R., Liu, X., Fidler, S., Urtasun, R., Yuille, A.: Detect what you can: detecting and representing objects using holistic models and body parts. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1971–1978 (2014)
Gao, J., Zhang, T., Yang, X., Xu, C.: P2t: part-to-target tracking via deep regression learning. IEEE Trans. Image Process. 27(6), 3074–3086 (2018)
Lim, J.J., Dollar, P., Zitnick III, C.L.: Learned mid-level representation for contour and object detection. Google Patents; 2014. US Patent App. 13/794,857
Wang, S., Lu, H., Yang, F., Yang, M.H.: Superpixel tracking. In: 2011 International Conference on Computer Vision, pp. 1323–1330. IEEE (2011)
Lee, S.H., Jang, W.D., Kim, C.S.: Tracking-by-segmentation using superpixel-wise neural network. IEEE Access 6, 54982–54993 (2018)
Yang, F., Lu, H., Yang, M.H.: Robust superpixel tracking. IEEE Trans. Image Process. 23(4), 1639–1651 (2014)
Verelst, T., Blaschko, M., Berman, M.: Generating superpixels using deep image representations. arXiv preprint arXiv:1903.04586 (2019)
Jampani, V., Sun, D., Liu, M.Y., Yang, M.H., Kautz, J.: Superpixel sampling networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 352–368 (2018)
Yang, F., Sun, Q., Jin, H., Zhou, Z.: Superpixel segmentation with fully convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13964–13973 (2020)
Yang, X., Wei, Z., Wang, N., Song, B., Gao, X.: A novel deformable body partition model for MMW suspicious object detection and dynamic tracking. Signal Process. 174, 107627 (2020)
Liu, W., Song, Y., Chen, D., He, S., Yu, Y., Yan, T., et al.: Deformable object tracking with gated fusion. IEEE Trans. Image Process. 28(8), 3766–3777 (2019)
Girshick, R., Felzenszwalb, P., McAllester, D.: Object detection with grammar models. Adv. Neural Inf. Process. Syst. 24, 442–450 (2011)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2009)
Azizpour, H., Laptev, I.: Object detection using strongly-supervised deformable part models. In: European Conference on Computer Vision, pp. 836–849. Springer (2012)
Ouyang, W., Wang, X.: Single-pedestrian detection aided by multi-pedestrian detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3198–3205 (2013)
Nam, H., Baek, M., Han, B.: Modeling and propagating cnns in a tree structure for visual tracking. arXiv preprint arXiv:1608.07242 (2016)
Wang, J., Fei, C., Zhuang, L., Yu, N.: Part-based multi-graph ranking for visual tracking. In: 2016 IEEE International Conference on Image Processing (ICIP). IEEE; 2016. p. 1714–1718
Du, D., Wen, L., Qi, H., Huang, Q., Tian, Q., Lyu, S.: Iterative graph seeking for object tracking. IEEE Trans. Image Process. 27(4), 1809–1821 (2017)
Du, D., Qi, H., Li, W., Wen, L., Huang, Q., Lyu, S.: Online deformable object tracking based on structure-aware hyper-graph. IEEE Trans. Image Process. 25(8), 3572–3584 (2016)
Wang, L., Lu, H., Yang, M.H.: Constrained superpixel tracking. IEEE Trans. Cybern. 48(3), 1030–1041 (2017)
Jianga, B., Zhang, P., Huang, L.: Visual object tracking by segmentation with graph convolutional network. arXiv preprint arXiv:2009.02523 (2020)
Parizi, S.N., Vedaldi, A., Zisserman, A., Felzenszwalb, P.: Automatic discovery and optimization of parts for image classification. arXiv preprint arXiv:1412.6598 (2014)
Li, Y., Liu, L., Shen, C., Van Den Hengel, A.: Mining mid-level visual patterns with deep CNN activations. Int. J. Comput. Vis. 121(3), 344–364 (2017)
Girshick, R., Iandola, F., Darrell, T., Malik, J.: Deformable part models are convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 437–446 (2015)
Sun, Y., Zheng, L., Li, Y., Yang, Y., Tian, Q., Wang, S.: Learning part-based convolutional features for person re-identification. IEEE Trans. Pattern Anal. Mach. Intell. 43, 902–917 (2019)
Qi, Y., Zhang, S., Qin, L., Yao, H., Huang, Q., Lim, J., et al.: Hedged deep tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4303–4311 (2016)
Mordan, T., Thome, N., Henaff, G., Cord, M.: End-to-end learning of latent deformable part-based representations for object detection. Int. J. Comput. Vis. 127(11), 1659–1679 (2019)
Zhang, Z., Xie, C., Wang, J., Xie, L., Yuille, A.L.: Deepvoting: a robust and explainable deep network for semantic part detection under partial occlusion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1372–1380 (2018)
Mordan, T., Thome, N., Cord, M., Henaff, G.: Deformable part-based fully convolutional network for object detection. arXiv preprint arXiv:1707.06175 (2017)
Jifeng, D., Yi, L., Kaiming, H., Jian, S.: Object detection via region-based fully convolutional networks. In: Advances in Neural Information Processing Systems, pp. 379–387 (2016)
Ouyang, W., Zeng, X., Wang, X., Qiu, S., Luo, P., Tian, Y., et al.: DeepID-Net: object detection with deformable part based convolutional neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(7), 1320–1334 (2016)
Yang, L., Xie, X., Li, P., Zhang, D., Zhang, L.: Part-based convolutional neural network for visual recognition. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 1772–1776. IEEE (2017)
Wang, J., Xie, C., Zhang, Z., Zhu, J., Xie, L., Yuille, A.: Detecting semantic parts on partially occluded objects. arXiv preprint arXiv:1707.07819 (2017)
Wang, J., Zhang, Z., Xie, C., Premachandran, V., Yuille, A.: Unsupervised learning of object semantic parts from internal states of cnns by population encoding. arXiv preprint arXiv:1511.06855 (2015)
Li, Y., Liu, L., Shen, C., van den Hengel, A.: Mid-level deep pattern mining. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 971–980 (2015)
Zhang, Q., Wu, Y.N., Zhu, S.C.: Interpretable convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8827–8836 (2018)
Stone, A., Wang, H., Stark, M., Liu, Y., Scott Phoenix, D., George, D.: Teaching compositionality to cnns. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5058–5067 (2017)
Ouyang, W., Wang, X.: Joint deep learning for pedestrian detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2056–2063 (2013)
Zhu, F., Kong, X., Zheng, L., Fu, H., Tian, Q.: Part-based deep hashing for large-scale person re-identification. IEEE Trans. Image Process. 26(10), 4806–4817 (2017)
Wu, G., Lu, W., Gao, G., Zhao, C., Liu, J.: Regional deep learning model for visual tracking. Neurocomputing 175, 310–323 (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105 (2012)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: European Conference on Computer Vision, pp. 818–833. Springer (2014)
Dinov, I.D.: Black box machine-learning methods: neural networks and support vector machines. In: Data Science and Predictive Analytics, pp. 383–422. Springer (2018)
Mozhdehi, R.J., Medeiros, H.: Deep convolutional particle filter for visual tracking. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 3650–3654. IEEE (2017)
Yang, B., Hu, X., Wang, F.: Kernel correlation filters based on feature fusion for visual tracking. J. Phys. Conf. Ser. 1601, 052026 (2020)
Yang, Y., Liao, S., Lei, Z., Li, S.: Large scale similarity learning using similar pairs for person verification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30 (2016)
Hirzer, M., Roth, P.M., Köstinger, M., Bischof, H.: Relaxed pairwise learned metric for person re-identification. In: European Conference on Computer Vision, pp. 780–793. Springer (2012)
Kulis, B., et al.: Metric learning: a survey. Found. Trends Mach. Learn. 5(4), 287–364 (2012)
Jia, Y., Darrell, T.: Heavy-tailed distances for gradient based image descriptors. Adv. Neural Inf. Process. Syst. 24, 397–405 (2011)
Simonyan, K., Vedaldi, A., Zisserman, A.: Learning local feature descriptors using convex optimisation. IEEE Trans. Pattern Anal. Mach. Intell. 36(8), 1573–1585 (2014)
Tian, S., Shen, S., Tian, G., Liu, X., Yin, B.: End-to-end deep metric network for visual tracking. Vis. Comput. 36(6), 1219–1232 (2020)
Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., et al.: Supervised contrastive learning. arXiv preprint arXiv:2004.11362 (2020)
Zhao, R., Ouyang, W., Wang, X.: Learning mid-level filters for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 144–151 (2014)
Paisitkriangkrai, S., Shen, C., Van Den Hengel, A.: Learning to rank in person re-identification with metric ensembles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1846–1855 (2015)
Yang, W., Liu, Y., Zhang, Q., Zheng, Y.: Comparative object similarity learning-based robust visual tracking. IEEE Access 7, 50466–50475 (2019)
Zhou, Y., Bai, X., Liu, W., Latecki, L.J.: Similarity fusion for visual tracking. Int. J. Comput. Vis. 118(3), 337–363 (2016)
Ning, J., Shi, H., Ni, J., Fu, Y.: Single-stream deep similarity learning tracking. IEEE Access 7, 127781–127787 (2019)
Chicco, D.: Siamese neural networks: an overview. In: Artificial Neural Networks, pp. 73–94 (2021)
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., Shah, R.: Signature verification using a “siamese” time delay neural network. Adv. Neural Inf. Process. Syst. 6, 737–744 (1993)
Vaquero, L., Brea, V.M., Mucientes, M.: Tracking more than 100 arbitrary objects at 25 FPS through deep learning. Pattern Recognit. 121, 108205 (2022)
Hare, S., Golodetz, S., Saffari, A., Vineet, V., Cheng, M.M., Hicks, S.L., et al.: Struck: structured output tracking with kernels. IEEE Trans. Pattern Anal. Mach. Intell. 38(10), 2096–2109 (2015)
Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.: Fully-convolutional siamese networks for object tracking. In: European Conference on Computer Vision, pp. 850–865. Springer (2016)
Valmadre, J., Bertinetto, L., Henriques, J., Vedaldi, A., Torr, P.H.: End-to-end representation learning for correlation filter based tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2805–2813 (2017)
Held, D., Thrun, S., Savarese, S.: Learning to track at 100 fps with deep regression networks. In: European Conference on Computer Vision, pp. 749–765. Springer (2016)
Li, B., Yan, J., Wu, W., Zhu, Z., Hu, X.: High performance visual tracking with siamese region proposal network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8971–8980 (2018)
Fan, H., Ling, H.: Siamese cascaded region proposal networks for real-time visual tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7952–7961 (2019)
He, A., Luo, C., Tian, X., Zeng, W.: A twofold siamese network for real-time object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4834–4843 (2018)
Zha, Y., Wu, M., Qiu, Z., Yu, W.: Visual tracking based on semantic and similarity learning. IET Comput. Vis. 13(7), 623–631 (2019)
Zagoruyko, S., Komodakis, N.: Learning to compare image patches via convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4353–4361 (2015)
Hoffer, E., Ailon, N.: Deep metric learning using triplet network. In: International Workshop on Similarity-Based Pattern Recognition, pp. 84–92. Springer (2015)
Liu, Y., Zhang, L., Chen, Z., Yan, Y., Wang, H.: Multi-stream siamese and faster region-based neural network for real-time object tracking. IEEE Trans. Intell. Transp. Syst. 22, 7279–7292 (2020)
Dong, X., Shen, J.: Triplet loss in siamese network for object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 459–474 (2018)
Li, K., Kong, Y., Fu, Y.: Visual object tracking via multi-stream deep similarity learning networks. IEEE Trans. Image Process. 29, 3311–3320 (2019)
Son, J., Baek, M., Cho, M., Han, B.: Multi-object tracking with quadruplet convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5620–5629 (2017)
Zhang, D., Zheng, Z.: Joint representation learning with deep quadruplet network for real-time visual tracking. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2020)
Chen, W., Chen, X., Zhang, J., Huang, K.: Beyond triplet loss: a deep quadruplet network for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 403–412 (2017)
Dike, H.U., Zhou, Y.: A robust quadruplet and faster region-based CNN for UAV video-based multiple object tracking in crowded environment. Electronics 10(7), 795 (2021)
Wu, C., Zhang, Y., Zhang, W., Wang, H., Zhang, Y., Zhang, Y., et al.: Motion guided siamese trackers for visual tracking. IEEE Access 8, 7473–7489 (2020)
Guo, Q., Feng, W., Zhou, C., Huang, R., Wan, L., Wang, S.: Learning dynamic siamese network for visual object tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1763–1771 (2017)
Yang, T., Chan, A.B.: Learning dynamic memory networks for object tracking. In: Proceedings of the European Conference on computer vision (ECCV), pp. 152–167 (2018)
Shi, T., Wang, D., Ren, H.: Triplet network template for siamese trackers. IEEE Access 9, 44426–44435 (2021)
Guo, D., Wang, J., Cui, Y., Wang, Z., Chen, S.: SiamCAR: siamese fully convolutional classification and regression for visual tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6269–6277 (2020)
Kim, M., Alletto, S., Rigazio, L.: Similarity mapping with enhanced siamese network for multi-object tracking. arXiv preprint arXiv:1609.09156 (2016)
Ma, C., Yang, C., Yang, F., Zhuang, Y., Zhang, Z., Jia, H., et al.: Trajectory factory: tracklet cleaving and re-connection by deep siamese bi-gru for multiple object tracking. In: 2018 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6. IEEE (2018)
Lee, S., Kim, E.: Multiple object tracking via feature pyramid siamese networks. IEEE Access 7, 8181–8194 (2018)
Liang, Y., Zhou, Y.: LSTM multiple object tracker combining multiple cues. In: 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 2351–2355. IEEE (2018)
Ma, L., Tang, S., Black, M.J., Van Gool, L.: Customized multi-person tracker. In: Asian Conference on Computer Vision, pp. 612–628. Springer (2018)
Mnih, V., Heess, N., Graves, A., Kavukcuoglu, K.: Recurrent models of visual attention. arXiv preprint arXiv:1406.6247 (2014)
Jenni, S., Jin, H., Favaro, P.: Steering self-supervised feature learning beyond local pixel statistics. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6408–6417 (2020)
Chu, Q., Ouyang, W., Li, H., Wang, X., Liu, B., Yu, N.: Online multi-object tracking using CNN-based single object tracker with spatial-temporal attention mechanism. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4836–4845 (2017)
Fiaz, M., Mahmood, A., Baek, K.Y., Farooq, S.S., Jung, S.K.: Improving object tracking by added noise and channel attention. Sensors 20(13), 3780 (2020)
Kim, C., Li, F., Rehg, J.M.: Multi-object tracking with neural gating using bilinear lstm. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 200–215 (2018)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Zhao, F., Zhang, T., Wu, Y., Tang, M., Wang, J.: Antidecay LSTM for siamese tracking with adversarial learning. IEEE Trans. Neural Netw. Learn. Syst. 32, 4475–4489 (2020)
Chen, X., Gupta, A.: Spatial memory for context reasoning in object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4086–4096 (2017)
Li, J., Wei, Y., Liang, X., Dong, J., Xu, T., Feng, J., et al.: Attentive contexts for object detection. IEEE Trans. Multimed. 19(5), 944–954 (2016)
Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803 (2018)
Ba, J., Mnih, V., Kavukcuoglu, K.: Multiple object recognition with visual attention. arXiv preprint arXiv:1412.7755 (2014)
Kosiorek, A.R., Bewley, A., Posner, I.: Hierarchical attentive recurrent tracking. arXiv preprint arXiv:1706.09262 (2017)
Cui, Z., Xiao, S., Feng, J., Yan, S.: Recurrently target-attending tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1449–1458 (2016)
Milan, A., Rezatofighi, S.H., Dick, A., Reid, I., Schindler, K.: Online multi-target tracking using recurrent neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
Quan, R., Zhu, L., Wu, Y., Yang, Y.: Holistic LSTM for pedestrian trajectory prediction. IEEE Trans. Image Process. 30, 3229–3239 (2021)
Shu, X., Tang, J., Qi, G., Liu, W., Yang, J.: Hierarchical long short-term concurrent memory for human interaction recognition. IEEE Trans. Pattern Anal. Mach. Intell. 43, 1110–1118 (2019)
Fang, K., Xiang, Y., Li, X., Savarese, S.: Recurrent autoregressive networks for online multi-object tracking. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 466–475. IEEE (2018)
Zhang, S., Yang, J., Schiele, B.: Occluded pedestrian detection through guided attention in cnns. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6995–7003 (2018)
Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
Stollenga, M., Masci, J., Gomez, F., Schmidhuber, J.: Deep networks with internal selective attention through feedback connections. arXiv preprint arXiv:1407.3068 (2014)
Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A., Bengio, Y.: Maxout networks. In: International Conference on Machine Learning, pp. 1319–1327. PMLR (2013)
Zhao, M., Okada, K., Inaba, M.: TrTr: visual tracking with transformer. arXiv preprint arXiv:2105.03817 (2021)
Xu, Y., Ban, Y., Delorme, G., Gan, C., Rus, D., Alameda-Pineda, X.: TransCenter: transformers with dense queries for multiple-object tracking. arXiv preprint arXiv:2103.15145 (2021)
Zeng, F., Dong, B., Wang, T., Chen, C., Zhang, X., Wei, Y.: MOTR: end-to-end multiple-object tracking with transformer. arXiv preprint arXiv:2105.03247 (2021)
Sun, P., Jiang, Y., Zhang, R., Xie, E., Cao, J., Hu, X., et al.: TransTrack: multiple-object tracking with transformer. arXiv preprint arXiv:2012.15460 (2020)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Meinhardt, T., Kirillov, A., Leal-Taixe, L., Feichtenhofer, C.: Trackformer: multi-object tracking with transformers. arXiv preprint arXiv:2101.02702 (2021)
Chen, Y., Cao, Y., Hu, H., Wang, L.: Memory enhanced global-local aggregation for video object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10337–10346 (2020)
Xiao, F., Lee, Y.J.: Spatial-temporal memory networks for video object detection. arXiv preprint arXiv:1712.06317 (2017)
Deng, H., Hua, Y., Song, T., Zhang, Z., Xue, Z., Ma, R., et al.: Object guided external memory network for video object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6678–6687 (2019)
Wang, L., Zhang, L., Wang, J., Yi, Z.: Memory mechanisms for discriminative visual tracking algorithms with deep neural networks. IEEE Trans. Cogn. Dev. Syst. 12(1), 98–108 (2019)
Jeon, S., Kim, S., Min, D., Sohn, K.: Parn: pyramidal affine regression networks for dense semantic correspondence. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 351–366 (2018)
Xie, Y., Shen, J., Wu, C.: Affine geometrical region CNN for object tracking. IEEE Access 8, 68638–68648 (2020)
Vu, H.T., Huang, C.C.: A multi-task convolutional neural network with spatial transform for parking space detection. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 1762–1766. IEEE (2017)
Zhou, Q., Zhong, B., Zhang, Y., Li, J., Fu, Y.: Deep alignment network based multi-person tracking with occlusion and motion reasoning. IEEE Trans. Multimed. 21(5), 1183–1194 (2018)
Li, Y., Bozic, A., Zhang, T., Ji, Y., Harada, T., Nießner, M.: Learning to optimize non-rigid tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4910–4918 (2020)
Li, C., Dobler, G., Feng, X., Wang, Y.: Tracknet: simultaneous object detection and tracking and its application in traffic video analysis. arXiv preprint arXiv:1902.01466 (2019)
Zhu, H., Liu, H., Zhu, C., Deng, Z., Sun, X.: Learning spatial-temporal deformable networks for unconstrained face alignment and tracking in videos. Pattern Recognit. 107, 107354 (2020)
Zhang, M., Wang, Q., Xing, J., Gao, J., Peng, P., Hu, W., et al.: Visual tracking via spatially aligned correlation filters network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 469–485 (2018)
Zhang, X., Lei, H., Ma, Y., Luo, S., Fan, X.: Spatial transformer part-based siamese visual tracking. In: 2020 39th Chinese Control Conference (CCC), pp. 7269–7274. IEEE (2020)
Qian, Y., Yang, M., Zhao, X., Wang, C., Wang, B.: Oriented spatial transformer network for pedestrian detection using fish-eye camera. IEEE Trans. Multimed. 22(2), 421–431 (2019)
Luo, H., Jiang, W., Fan, X., Zhang, C.: Stnreid: deep convolutional networks with pairwise spatial transformer networks for partial person re-identification. IEEE Trans. Multimed. 22(11), 2905–2913 (2020)
Li, D., Chen, X., Zhang, Z., Huang, K.: Learning deep context-aware features over body and latent parts for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 384–393 (2017)
Zhang, Y., Tang, Y., Fang, B., Shang, Z.: Multi-object tracking using deformable convolution networks with tracklets updating. Int. J. Wavelets Multiresolut. Inf. Process. 17(06), 1950042 (2019)
Wu, H., Xu, Z., Zhang, J., Jia, G.: Offset-adjustable deformable convolution and region proposal network for visual tracking. IEEE Access 7, 85158–85168 (2019)
Cao, W.M., Chen, X.J.: Deformable convolutional networks tracker. In: DEStech Transactions on Computer Science and Engineering (ITEEE) (2019)
Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. arXiv preprint arXiv:1506.02025 (2015)
Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., et al.: Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 764–773 (2017)
Mumuni, A., Mumuni, F.: CNN architectures for geometric transformation-invariant feature representation in computer vision: a review. SN Comput. Sci. 2(5), 1–23 (2021)
Wang, X., Shrivastava, A., Gupta, A.: A-fast-rcnn: hard positive generation via adversary for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2606–2615 (2017)
Lin, C.H., Yumer, E., Wang, O., Shechtman, E., Lucey, S.: St-gan: spatial transformer generative adversarial networks for image compositing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9455–9464 (2018)
Zhang, D., Zheng, Z., Wang, T., He, Y.: HROM: learning high-resolution representation and object-aware masks for visual object tracking. Sensors 20(17), 4807 (2020)
Johnander, J., Danelljan, M., Khan, F.S., Felsberg, M.: DCCO: towards deformable continuous convolution operators for visual tracking. In: International Conference on Computer Analysis of Images and Patterns, pp. 55–67. Springer (2017)
Araujo, A., Norris, W., Sim, J.: Computing receptive fields of convolutional neural networks. Distill 4(11), e21 (2019)
Chen, Z., Zhong, B., Li, G., Zhang, S., Ji, R.: Siamese box adaptive network for visual tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6668–6677 (2020)
Jiang, X., Li, P., Zhen, X., Cao, X.: Model-free tracking with deep appearance and motion features integration. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 101–110. IEEE (2019)
Dequaire, J., Rao, D., Ondruska, P., Wang, D., Posner, I.: Deep tracking on the move: Learning to track the world from a moving vehicle using recurrent neural networks. arXiv preprint arXiv:1609.09365 (2016)
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
Li, Y., Zhang, X., Chen, D.: Csrnet: dilated convolutional neural networks for understanding the highly congested scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1091–1100 (2018)
Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., et al.: Understanding convolution for semantic segmentation. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1451–1460. IEEE (2018)
Zhang, Z., Peng, H., Fu, J., Li, B., Hu, W.: Ocean: Object-aware anchor-free tracking. arXiv preprint arXiv:2006.10721 (2020)
Weng, X., Wu, S., Beainy, F., Kitani, K.M.: Rotational rectification network: enabling pedestrian detection for mobile vision. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1084–1092. IEEE (2018)
Marcos, D., Volpi, M., Tuia, D.: Learning rotation invariant convolutional filters for texture classification. In: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 2012–2017. IEEE (2016)
Jacobsen, J.H., De Brabandere, B., Smeulders, A.W.: Dynamic steerable blocks in deep residual networks. arXiv preprint arXiv:1706.00598 (2017)
Tarasiuk, P., Pryczek, M.: Geometric transformations embedded into convolutional neural networks. J. Appl. Comput. Sci. 24(3), 33–48 (2016)
Henriques, J.F., Vedaldi, A.: Warped convolutions: Efficient invariance to spatial transformations. In: International conference on machine learning, pp. 1461–1469. PMLR (2017)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Yang, L., Han, Y., Chen, X., Song, S., Dai, J., Huang, G.: Resolution adaptive networks for efficient inference. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2369–2378 (2020)
Tamura, M., Horiguchi, S., Murakami, T.: Omnidirectional pedestrian detection by rotation invariant training. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1989–1998. IEEE (2019)
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Coors, B., Condurache, A.P., Geiger, A.: Spherenet: learning spherical representations for detection and classification in omnidirectional images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 518–533 (2018)
Rashed, H., Mohamed, E., Sistu, G., Kumar, V.R., Eising, C., El-Sallab, A., et al.: Generalized object detection on fisheye cameras for autonomous driving: Dataset, representations and baseline. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2272–2280 (2021)
Hao, Z., Liu, Y., Qin, H., Yan, J., Li, X., Hu, X.: Scale-aware face detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6186–6195 (2017)
Yang, Z., Xu, Y., Dai, W., Xiong, H.: Dynamic-stride-net: deep convolutional neural network with dynamic stride. In: Optoelectronic Imaging and Multimedia Technology VI, vol. 11187, p. 1118707. International Society for Optics and Photonics (2019)
Wen, L., Du, D., Cai, Z., Lei, Z., Chang, M.C., Qi, H., et al.: UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking. Comput. Vis. Image Underst. 193, 102907 (2020)
Wu, Y., Lim, J., Yang, M.H.: Object tracking benchmark. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1834–1848 (2015)
Kristan, M., Matas, J., Leonardis, A., Felsberg, M., Cehovin, L., Fernandez, G., et al.: The visual object tracking vot2015 challenge results. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 1–23 (2015)
Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Cehovin Zajc, L., et al.: The visual object tracking VOT2016 challenge results. In: Computer Vision—ECCV 2016 Workshops, pp. 777–823 (2016)
Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Cehovin Zajc, L., et al.: The visual object tracking vot2017 challenge results. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 1949–1972 (2017)
Leal-Taixé, L., Milan, A., Reid, I., Roth, S., Schindler, K.: Motchallenge 2015: Towards a benchmark for multi-target tracking. arXiv preprint arXiv:1504.01942 (2015)
Milan, A., Leal-Taixé, L., Reid, I., Roth, S., Schindler, K.: MOT16: a benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831 (2016)
Dendorfer, P., Rezatofighi, H., Milan, A., Shi, J., Cremers, D., Reid, I., et al.: Mot20: a benchmark for multi object tracking in crowded scenes. arXiv preprint arXiv:2003.09003 (2020)
Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the kitti dataset. Int. J. Robot. Res. 32(11), 1231–1237 (2013)
Kiani Galoogahi, H., Fagg, A., Huang, C., Ramanan, D., Lucey, S.: Need for speed: a benchmark for higher frame rate object tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1125–1134 (2017)
Mueller, M., Smith, N., Ghanem, B.: A benchmark and simulator for uav tracking. In: European Conference on Computer Vision, pp. 445–461. Springer (2016)
Liang, P., Blasch, E., Ling, H.: Encoding color information for visual tracking: algorithms and benchmark. IEEE Trans. Image Process. 24(12), 5630–5644 (2015)
Huang, L., Zhao, X., Huang, K.: GOT-10k: a large high-diversity benchmark for generic object tracking in the wild. IEEE Trans. Pattern Anal. Mach. Intell. 43, 1562–1577 (2019)
Fan, H., Lin, L., Yang, F., Chu, P., Deng, G., Yu, S., et al.: LaSOT: a high-quality benchmark for large-scale single object tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5374–5383 (2019)
Muller, M., Bibi, A., Giancola, S., Alsubaihi, S., Ghanem, B.: Trackingnet: a large-scale dataset and benchmark for object tracking in the wild. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 300–317 (2018)
Du, D., Qi, Y., Yu, H., Yang, Y., Duan, K., Li, G., et al.: The unmanned aerial vehicle benchmark: object detection and tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 370–386 (2018)
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
Real, E., Shlens, J., Mazzocchi, S., Pan, X., Vanhoucke, V.: Youtube-boundingboxes: A large high-precision human-annotated data set for object detection in video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5296–5305 (2017)
Wu, Y., Lim, J., Yang, M.H.: Online object tracking: a benchmark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2411–2418 (2013)
Zhang, Z., Peng, H.: Deeper and wider siamese networks for real-time visual tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4591–4600 (2019)
Danelljan, M., Bhat, G., Shahbaz Khan, F., Felsberg, M.: Eco: efficient convolution operators for tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6638–6646 (2017)
Yang, T., Chan, A.B.: Visual tracking via dynamic memory networks. IEEE Trans. Pattern Anal. Mach. Intell. 43(1), 360–374 (2019)
Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., Yan, J.: Siamrpn++: evolution of siamese visual tracking with very deep networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4282–4291 (2019)
Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: Atom: accurate tracking by overlap maximization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4660–4669 (2019)
Bhat, G., Danelljan, M., Gool, L.V., Timofte, R.: Learning discriminative model prediction for tracking. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6182–6191 (2019)
Lukezic, A., Matas, J., Kristan, M.: D3S: a discriminative single shot segmentation tracker. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7133–7142 (2020)
Xie, F., Yang, W., Zhang, K., Liu, B., Wang, G., Zuo, W.: Learning spatio-appearance memory network for high-performance visual tracking. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2678–2687 (2021)
Zhang, Y., Wang, C., Wang, X., Zeng, W., Liu, W.: Fairmot: on the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vis. 129, 3069–3087 (2021)
Zheng, L., Tang, M., Chen, Y., Zhu, G., Wang, J., Lu, H.: Improving multiple object tracking with single object tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2453–2462 (2021)
Zhu, J., Yang, H., Liu, N., Kim, M., Zhang, W., Yang, M.H.: Online multi-object tracking with dual matching attention networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 366–382 (2018)
Brasó, G., Leal-Taixé, L.: Learning a neural solver for multiple object tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6247–6257 (2020)
Saleh, F., Aliakbarian, S., Rezatofighi, H., Salzmann, M., Gould, S.: Probabilistic tracklet scoring and inpainting for multiple object tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14329–14339 (2021)
Zhang, Y., Sun, P., Jiang, Y., Yu, D., Yuan, Z., Luo, P., et al.: ByteTrack: multi-object tracking by associating every detection box. arXiv preprint arXiv:2110.06864 (2021)
Wang, Q., Zheng, Y., Pan, P., Xu, Y.: Multiple object tracking with correlation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3876–3886 (2021)
Liang, C., Zhang, Z., Zhou, X., Li, B., Lu, Y., Hu, W.: One more check: making “fake background” be tracked again. arXiv preprint arXiv:2104.09441 (2021)
Bernardin, K., Stiefelhagen, R.: Evaluating multiple object tracking performance: the clear mot metrics. EURASIP J. Image Video Process. 2008, 1–10 (2008)
Wu, S., Xu, Y.: DSN: a new deformable subnetwork for object detection. IEEE Trans. Circuits Syst. Video Technol. 30(7), 2057–2066 (2019)
Liu, Y., Duanmu, M., Huo, Z., Qi, H., Chen, Z., Li, L., et al.: Exploring multi-scale deformable context and channel-wise attention for salient object detection. Neurocomputing 428, 92–103 (2021)
Lee, H., Choi, S., Kim, C.: A memory model based on the siamese network for long-term tracking. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)
Fiaz, M., Mahmood, A., Jung, S.K.: Learning soft mask based feature fusion with channel and spatial attention for robust visual object tracking. Sensors 20(14), 4021 (2020)
Lee, D.J.L., Macke, S., Xin, D., Lee, A., Huang, S., Parameswaran, A.G.: A human-in-the-loop perspective on AutoML: milestones and the road ahead. IEEE Data Eng. Bull. 42(2), 59–70 (2019)
Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 (2016)
Kandasamy, K., Neiswanger, W., Schneider, J., Poczos, B., Xing, E.: Neural architecture search with bayesian optimisation and optimal transport. arXiv preprint arXiv:1802.07191 (2018)
Lu, Z., Whalen, I., Boddeti, V., Dhebar, Y., Deb, K., Goodman, E., et al.: Nsga-net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 419–427 (2019)
Mumuni, A., Mumuni, F. Robust appearance modeling for object detection and tracking: a survey of deep learning approaches. Prog Artif Intell 11, 279–313 (2022). https://doi.org/10.1007/s13748-022-00290-6