Realtime and robust object matching with a large number of templates

Hong, Chaoqun; Zhu, Jianke; Yu, Jun; Cheng, Jun; Chen, Xuhui

doi:10.1007/s11042-014-2305-7

Published: 19 November 2014

Volume 75, pages 1459–1480, (2016)
Multimedia Tools and Applications

Chaoqun Hong¹,
Jianke Zhu²,
Jun Yu³,
Jun Cheng^4,5 &
…
Xuhui Chen¹

Abstract

Most conventional object matching methods compare local features, which is computationally demanding. Dominant Orientation Templates (DOT) were recently proposed to address this efficiency issue. Although DOT obtains promising results, its representation wastes many bits and it is fragile under partial occlusion. Its performance also decreases as the number of templates increases. We therefore propose a compact DOT representation (C-DOT) with a fast approach to handling partial occlusion. Instead of the seven orientations used in the original implementation, C-DOT employs only the single orientation of the highest gradients. Consequently, the size of each feature vector is reduced from 8 bits to 3 bits. To handle partial occlusion efficiently, we introduce the C-DOT similarity map, which stores the matching scores of the individual grids in each sliding window and is then used to infer the occlusion map. The experimental results demonstrate that the proposed method outperforms DOT.
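
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the cell size, the magnitude threshold `min_mag`, the use of code 0 for "no strong gradient", the equality-based match score, and all function names are assumptions made here for illustration. The sketch keeps one quantized orientation (1 to 7, so it fits in 3 bits) per grid cell and builds a per-cell similarity map between a template and a window; inferring the occlusion map from that similarity map is not covered.

```python
import math

N_BINS = 7  # number of quantized orientations, as stated in the abstract


def quantize_orientation(gx, gy, n_bins=N_BINS):
    """Map a gradient to a bin in 1..n_bins, ignoring the gradient sign."""
    angle = math.atan2(gy, gx) % math.pi  # fold the direction into [0, pi)
    return min(int(angle / math.pi * n_bins), n_bins - 1) + 1


def cdot_codes(image, cell=2, min_mag=1.0):
    """Return a grid of 3-bit codes: 0 = no strong gradient, 1..7 = orientation.

    `cell` and `min_mag` are hypothetical parameters chosen for this sketch.
    """
    h, w = len(image), len(image[0])
    codes = []
    for cy in range(0, h - cell + 1, cell):
        row = []
        for cx in range(0, w - cell + 1, cell):
            best_mag, best_code = 0.0, 0
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # Central differences, clamped at the image border.
                    gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
                    gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
                    mag = math.hypot(gx, gy)
                    # Keep only the orientation of the strongest gradient.
                    if mag >= min_mag and mag > best_mag:
                        best_mag, best_code = mag, quantize_orientation(gx, gy)
            row.append(best_code)
        codes.append(row)
    return codes


def similarity_map(template_codes, window_codes):
    """Per-cell scores: 1 where both codes name the same orientation, else 0."""
    return [
        [1 if t != 0 and t == w else 0 for t, w in zip(trow, wrow)]
        for trow, wrow in zip(template_codes, window_codes)
    ]


if __name__ == "__main__":
    vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
    horizontal_edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
    t = cdot_codes(vertical_edge)
    print(t)
    print(similarity_map(t, cdot_codes(horizontal_edge)))
```

Because each cell stores a single code in the range 0 to 7, one cell needs 3 bits rather than one bit per orientation. Low per-cell scores clustered in one region of the similarity map are the kind of signal an occlusion test could use, though the rule the paper applies is not given in the abstract.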

Acknowledgments

This work is supported by the Natural Science Foundation of China (61472110, 61202145, 61100104 and 61065007), the Program for New Century Excellent Talents in University (No. NECT-12-0323) and the Natural Science Foundation of Fujian Province of China (2012J01287 and 2014J01256).

Author information

Authors and Affiliations

School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
Chaoqun Hong & Xuhui Chen
College of Computer Science, Zhejiang University, Hangzhou, China
Jianke Zhu
School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
Jun Yu
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Jun Cheng
The Chinese University of Hong Kong, Shatin, Hong Kong, China
Jun Cheng

Corresponding author

Correspondence to Jun Yu.

About this article

Cite this article

Hong, C., Zhu, J., Yu, J. et al. Realtime and robust object matching with a large number of templates. Multimed Tools Appl 75, 1459–1480 (2016). https://doi.org/10.1007/s11042-014-2305-7

Received: 08 November 2013
Revised: 09 September 2014
Accepted: 29 September 2014
Published: 19 November 2014
Issue Date: February 2016
DOI: https://doi.org/10.1007/s11042-014-2305-7
