Abstract
Recently, keypoint-based methods have received increasing attention in planar object tracking owing to their ability to handle partial perturbations such as occlusion and out-of-view. However, robust tracking remains difficult under fast movement, large transformation, and motion blur, mainly because such perturbations leave too few matching inliers to reconstruct the homography. To this end, we propose a novel framework, centroid-based graph matching networks (CGN), which consists of two components: a centroid localization network (CLN) and a graph matching network (GMN). The CLN narrows the tracker's search range from the entire image to the target region by locating the centroid of the target; this initial position estimate guarantees a sufficient proportion of inliers matching the template. The keypoints in the template and the target region are then modeled as two graphs connected by cross-edges, and their correspondences are established by the GMN, which exploits the stability of the graph structure to overcome large transformations. Finally, the transformation from the template to the current frame is estimated from the matched keypoint pairs with the RANSAC algorithm. In addition, because the labeled points in previous datasets are too few to train matching models that cope with complex transformations, we synthesize a large-scale labeled dataset to train the GMN. Experimental results on the POT-210, POIC, and TMT datasets show that the proposed method generally outperforms state-of-the-art baselines, with significant improvements under fast movement and motion blur.
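The final step of the pipeline described above, estimating the template-to-frame homography from matched keypoint pairs with RANSAC, can be sketched in plain NumPy. This is an illustrative sketch, not the paper's implementation: the DLT solver, iteration count, and inlier threshold here are assumptions chosen for clarity.

```python
import numpy as np

def estimate_homography_dlt(src, dst):
    """Direct Linear Transform: fit a 3x3 homography from >= 4 point pairs."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the homogeneous system A h = 0.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def ransac_homography(src, dst, iters=500, thresh=3.0, rng=None):
    """Robustly estimate a homography from noisy matches via RANSAC."""
    rng = np.random.default_rng(rng)
    src_h = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coordinates
    best_H, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        # Fit a candidate homography on a minimal random sample of 4 matches.
        idx = rng.choice(len(src), 4, replace=False)
        H = estimate_homography_dlt(src[idx], dst[idx])
        # Score by counting matches whose reprojection error is below the threshold.
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    # Refit on the full consensus set for a more accurate final estimate.
    if best_inliers.sum() >= 4:
        best_H = estimate_homography_dlt(src[best_inliers], dst[best_inliers])
    return best_H, best_inliers
```

In practice, a library routine such as OpenCV's `cv2.findHomography(src, dst, cv2.RANSAC)` performs the same role; the sketch makes explicit why a minimum fraction of inliers among the matches, which the CLN's localization step is meant to secure, is necessary for the estimate to succeed.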
Li, K., Liu, H. & Wang, T. Centroid-based graph matching networks for planar object tracking. Machine Vision and Applications 34, 31 (2023). https://doi.org/10.1007/s00138-023-01382-6