Abstract
Template matching techniques are often used for camera tracking because they offer a good balance between computational cost and robustness to illumination changes. However, they lack robustness to changes in camera orientation and scale. Camera movement, and especially rotation, generates perspective deformations that degrade patch matching, so the number of inliers (3D–2D correspondences) decreases and camera tracking becomes less stable. This paper provides statistical proof that considering the surface normals associated with 3D points substantially increases the number of inliers, and therefore that computing perspective compensation improves tracking. For instance, in one camera path used for the experiments in this paper, only \(14\%\) of the 3D points projected into the image were found as inliers without compensation, while perspective compensation increased that figure to \(65\%\). These results should be read in the context of the analysis provided in the paper.
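The abstract does not state how the compensation is computed. As an illustration only, the following minimal sketch uses the standard plane-induced homography, assuming each 3D point lies on a small locally planar patch with unit normal `n` at distance `d` from the reference camera (plane equation `n . X = d` in reference camera coordinates), and that the current camera pose is `X_cur = R @ X_ref + t`. The function names, the intrinsics `K`, and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def plane_induced_homography(K, R, t, n, d):
    """Homography mapping reference-view pixels to current-view pixels
    for points on the plane n . X = d (reference camera coordinates).

    Assumes both views share the same intrinsic matrix K and that the
    current camera pose satisfies X_cur = R @ X_ref + t.
    The result is defined up to scale, as any homography is.
    """
    n = np.asarray(n, dtype=float).reshape(3, 1)
    t = np.asarray(t, dtype=float).reshape(3, 1)
    return K @ (R + (t @ n.T) / d) @ np.linalg.inv(K)

def project(K, X):
    """Pinhole projection of a 3D point given in camera coordinates."""
    x = K @ X
    return x[:2] / x[2]

def apply_homography(H, x):
    """Apply a 3x3 homography to a 2D pixel position."""
    xh = H @ np.array([x[0], x[1], 1.0])
    return xh[:2] / xh[2]

# Hypothetical setup: a 20 degree rotation about the y axis plus a small translation.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(20.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.1, 0.0, 0.05])

# Patch plane z = 2 in the reference camera frame, and one point on it.
n = np.array([0.0, 0.0, 1.0])
d = 2.0
X_ref = np.array([0.3, -0.2, 2.0])

H = plane_induced_homography(K, R, t, n, d)
x_ref = project(K, X_ref)            # where the patch was seen originally
x_cur = project(K, R @ X_ref + t)    # where it appears from the new pose
x_warp = apply_homography(H, x_ref)  # prediction from the homography
```

In a tracker, a homography of this kind could be used to warp the stored template before comparing it with the current image, so that the comparison is made between patches with similar perspective. Whether the paper computes the warp in exactly this way is not stated in the abstract.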
Acknowledgements
The authors wish to thank David Oyarzun, former head of the Interactive and Computer Graphics department of Vicomtech-Ik4, for his expert advice.
Cite this article
Barrena, N., Sánchez, J.R., Ugarte, R.J. et al. Proving the efficiency of template matching-based markerless tracking methods which consider the camera perspective deformations. Machine Vision and Applications 29, 573–584 (2018). https://doi.org/10.1007/s00138-018-0914-2
DOI: https://doi.org/10.1007/s00138-018-0914-2