Abstract
Loop closure detection (LCD) is a key step in visual simultaneous localization and mapping systems to correct the map and relocalize the vehicle. However, it may fail when illumination variations and shifting dynamics are present in the scenes. One way to improve the precision is to effectively combine multiple image features that supply complementary information. In this paper, a general method to quantitatively measure the efficacy of individual image features, as well as feature combinations, for LCD is proposed by computing statistical distances between the distributions of feature vectors. Based on different statistical distances, including the Kullback-Leibler divergence, the Bhattacharyya divergence and the Wasserstein metric, various numerical indices capable of evaluating feature combinations are obtained and compared. An unsupervised algorithm is further proposed to optimize feature combinations by maximizing any of the indices. Experiments show that the proposed indices can measure the efficacy of image features and that the feature combinations maximizing the indices can improve the precision of LCD.
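As a rough illustration of the idea (not the paper's implementation), the sketch below evaluates three closed-form statistical distances between two univariate Gaussians, e.g. fitted to the similarity scores that a single feature produces on loop-closure pairs versus non-loop-closure pairs; the Gaussian score model, variable names and numbers are hypothetical.

```python
# Minimal sketch, assuming each class of similarity scores is modeled as a
# univariate Gaussian; a larger distance between the two score distributions
# suggests the feature separates loop closures from non-closures better.
import numpy as np

def kl_gauss(m1, s1, m2, s2):
    """Kullback-Leibler divergence KL(N(m1, s1^2) || N(m2, s2^2))."""
    return np.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5

def bhattacharyya_gauss(m1, s1, m2, s2):
    """Bhattacharyya distance between N(m1, s1^2) and N(m2, s2^2)."""
    return (0.25 * np.log(0.25 * (s1**2 / s2**2 + s2**2 / s1**2 + 2))
            + 0.25 * (m1 - m2)**2 / (s1**2 + s2**2))

def wasserstein2_gauss(m1, s1, m2, s2):
    """2-Wasserstein metric between N(m1, s1^2) and N(m2, s2^2)."""
    return np.sqrt((m1 - m2)**2 + (s1 - s2)**2)

# Hypothetical scores of one feature on positive (loop) and negative pairs.
pos = np.random.default_rng(0).normal(0.8, 0.05, 1000)
neg = np.random.default_rng(1).normal(0.5, 0.10, 1000)
m1, s1, m2, s2 = pos.mean(), pos.std(), neg.mean(), neg.std()
print(kl_gauss(m1, s1, m2, s2),
      bhattacharyya_gauss(m1, s1, m2, s2),
      wasserstein2_gauss(m1, s1, m2, s2))
```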
References
Scaramuzza, D., Fraundorfer, F.: Visual odometry [tutorial]. IEEE Robot. Autom. Mag. 18(4), 80–92 (2011)
Valgren, C., Lilienthal, A.J.: Sift, surf and seasons: Long-term outdoor localization using local features. In: European Conference on Mobile Robots (ECMR), pp. 253–258 (2007)
Liu, Y., Zhang, H.: Visual loop closure detection with a compact image descriptor. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1051–1056. IEEE (2012)
Campos, F.M., Correia, L., Calado, J.M.: Loop closure detection with a holistic image feature. In: Portuguese Conference on Artificial Intelligence, pp. 247–258. Springer (2013)
Sünderhauf, N., Protzel, P.: Brief-gist: Closing the loop by simple means. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1234–1241. IEEE (2011)
Arroyo, R., Alcantarilla, P.F., Bergasa, L.M., et al.: Bidirectional loop closure detection on panoramas for visual navigation. In: IEEE Intelligent Vehicles Symposium, pp. 1378–1383. IEEE (2014)
Wang, X., Zhang, H., Peng, G.: A chordiogram image descriptor using local edgels. J. Vis. Commun. Image R. 49, 129–140 (2017)
Sünderhauf, N., Shirazi, S., Dayoub, F., et al.: On the performance of convnet features for place recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4297–4304. IEEE (2015)
Wang, X., Peng, G., Zhang, H.: Combining multiple image descriptions for loop closure detection. J. Intell. Robot. Syst. 92(3), 565–585 (2018)
Cummins, M., Newman, P.: Probabilistic appearance based navigation and loop closing. In: 2007 IEEE International Conference on Robotics and Automation (ICRA), pp. 2042–2048. IEEE (2007)
Angeli, A., Doncieux, S., Meyer, J.A., Filliat, D.: Real-time visual loop-closure detection. In: 2008 IEEE International Conference on Robotics and Automation (ICRA), pp. 1842–1847. IEEE (2008)
Angeli, A., Filliat, D., Doncieux, S., Meyer, J.A.: Fast and incremental method for loop-closure detection using bags of visual words. IEEE Trans. Robot. 24(5), 1027–1037 (2008)
Cummins, M., Newman, P.: Fab-map: Probabilistic localization and mapping in the space of appearance. Int. J. Rob. Res. 27(6), 647–665 (2008)
Milford, M.J., Wyeth, G.F.: Seqslam: Visual route-based navigation for sunny summer days and stormy winter nights. In: 2012 IEEE International Conference on Robotics and Automation (ICRA), pp. 1643–1649. IEEE (2012)
Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: Orb-slam: a versatile and accurate monocular slam system. IEEE Trans. Robot. 31(5), 1147–1163 (2015)
Engel, J., Schöps, T., Cremers, D.: Lsd-slam: Large-scale direct monocular slam. In: European Conference on Computer Vision, pp. 834–849. Springer (2014)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features (surf). Comput. Vis. Image Underst. 110(3), 346–359 (2008)
Oliva, A., Torralba, A.: Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)
Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
Calonder, M., Lepetit, V., Ozuysal, M., et al.: Brief: Computing a local binary descriptor very fast. IEEE Trans. Pattern Anal. Mach. Intell. 34(7), 1281–1298 (2012)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: Orb: An efficient alternative to sift or surf. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2564–2571. IEEE (2011)
Toshev, A., Taskar, B., Daniilidis, K.: Shape-based object detection via boundary structure segmentation. Int. J. Comput. Vis. 99(2), 123–146 (2012)
Perronnin, F., Dance, C.R.: Fisher kernels on visual vocabularies for image categorization. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8. IEEE (2007)
Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3304–3311 (2010)
Arandjelovic, R., Zisserman, A.: All about vlad. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1578–1585 (2013)
Hou, Y., Zhang, H., Zhou, S.: Convolutional neural network-based image representation for visual loop closure detection. In: 2015 IEEE International Conference on Information and Automation, pp. 2238–2245. IEEE (2015)
Li, Q., Li, K., You, X., et al.: Place recognition based on deep feature and adaptive weighting of similarity matrix. Neurocomputing 199, 114–127 (2016)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)
Gu, Q., Li, Z., Han, J.: Generalized fisher score for feature selection. In: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, UAI’11, pp 266–273. AUAI Press, Arlington (2011)
Loog, M., Duin, R.P.W., Haeb-Umbach, R.: Multiclass linear dimension reduction by weighted pairwise fisher criteria. IEEE Trans. Pattern Anal. Mach. Intell. 23(7), 762–766 (2001)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J Mach. Learn. Res. 3(Mar), 1157–1182 (2003)
Kuncheva, L.I.: Combining pattern classifiers: Methods and algorithms. Wiley (2004)
Kittler, J., Hatef, M., Duin, R.P., Matas, J.: On combining classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 20(3), 226–239 (1998)
Campos, F.M., Correia, L., Calado, J.M.: Robot visual localization through local feature fusion: An evaluation of multiple classifiers combination approaches. J. Intell. Robot. Syst. 77(2), 377–390 (2015)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Statist. 22(1), 79–86 (1951)
Bhattacharyya, A.: On a measure of divergence between two statistical populations defined by their probability distribution. Bull. Calcutta Math. Soc. 35, 99–109 (1943)
Van Erven, T., Harremos, P.: Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theory 60(7), 3797–3820 (2014)
Gil, M., Alajaji, F., Linder, T.: Rényi divergence measures for commonly used univariate continuous distributions. Inf. Sci. 249, 124–131 (2013)
Crysandt, H.: Linear feature vector compression using Kullback-Leibler distance. In: 2006 IEEE International Symposium on Signal Processing and Information Technology, pp. 556–561. IEEE (2006)
Arandjelović, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2911–2918. IEEE (2012)
Choi, E., Lee, C.: Feature extraction based on the Bhattacharyya distance. Pattern Recognit. 36(8), 1703–1709 (2003)
Levina, E., Bickel, P.: The earth mover’s distance is the mallows distance: Some insights from statistics. In: 2001 IEEE International Conference on Computer Vision (ICCV), vol. 2, pp. 251–256. IEEE (2001)
Rubner, Y., Tomasi, C., Guibas, L.J.: The earth mover’s distance as a metric for image retrieval. Int. J. Comput. Vis. 40(2), 99–121 (2000)
Qiu, K., Ai, Y., Tian, B., et al.: Siamese-resnet: Implementing loop closure detection based on siamese network. In: IEEE Intelligent Vehicles Symposium, pp. 716–721 (2018)
Chen, Z., Liu, L., Sa, I., et al.: Learning context flexible attention model for long-term visual place recognition. IEEE Robot. Autom. Lett. 3(4), 4015–4022 (2018)
Zhang, H., Han, F., Wang, H.: Robust multimodal sequence-based loop closure detection via structured sparsity. In: Proceedings of Robotics: Science and Systems. Ann Arbor (2016)
Ho, K.L., Newman, P.: Detecting loop closure with scene sequences. Int. J. Comput. Vis. 74(3), 261–286 (2007)
The Norwegian Broadcasting Corporation (NRK): The Nordlandsbanen dataset. http://nrkbeta.no/2013/01/15/nordlandsbanen-minute-by-minute-season-by-season/ (2013)
Glover, A.J., Maddern, W.P., Milford, M.J., Wyeth, G.F.: Fab-map+ratslam: Appearance-based slam for multiple times of day. In: 2010 IEEE International Conference on Robotics and Automation (ICRA), pp. 3507–3512. IEEE (2010)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
Paszke, A., Gross, S., Massa, F., et al.: Pytorch: An imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc (2019)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: 2006 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 2169–2178. IEEE (2006)
Davis, J., Goadrich, M.: The relationship between precision-recall and roc curves. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 233–240. ACM (2006)
Funding
This work was supported by the Fundamental Research Funds for the Central Universities (No. GK202103008).
Author information
Contributions
Xiaolong Wang carried out the research and prepared the manuscript. Hong Zhang and Guohua Peng are his supervisors and provided suggestions and assistance in this study.
Ethics declarations
Conflicts of interest/Competing interests
The authors declare that they have no competing interests.
Appendix
A.1 Proof of the Assertions in Section ??
Suppose that there are two features $\boldsymbol h=[h_{1},h_{2}]^{T}$ with the statistics
$$ \boldsymbol s=\left[ \begin{array}{c} s_{1} \\ s_{2} \end{array}\right],\qquad \boldsymbol{\varSigma}=\left[ \begin{array}{cc} {\sigma_{1}^{2}} & \rho\sigma_{1}\sigma_{2} \\ \rho\sigma_{1}\sigma_{2} & {\sigma_{2}^{2}} \end{array}\right], $$ (1)
where $s_{1}$ and $s_{2}$ are positive and the correlation coefficient $\rho<1$.
1. Suppose that $D[h_{2}]>D[h_{1}]$, then
$$ D[h_{2}]=\frac{{s_{2}^{2}}}{{\sigma_{2}^{2}}}>\frac{{s_{1}^{2}}}{{\sigma_{1}^{2}}}=D[h_{1}] $$ (2)
and equivalently, $s_{2}/s_{1}>\sigma_{2}/\sigma_{1}$. Therefore
$$ \left( 1+\frac{s_{2}}{s_{1}}\right)^{2}>\left( 1+\frac{\sigma_{2}}{\sigma_{1}}\right)^{2}>1+\left( \frac{\sigma_{2}}{\sigma_{1}}\right)^{2}+2\rho\frac{\sigma_{2}}{\sigma_{1}} $$ (3)
and equivalently,
$$ D[h_{1}+h_{2}]=\frac{(s_{1}+s_{2})^{2}}{{\sigma_{1}^{2}}+{\sigma_{2}^{2}}+2\rho\sigma_{1}\sigma_{2}}>\frac{{s_{1}^{2}}}{{\sigma_{1}^{2}}}=D[h_{1}]. $$ (4)
2. The optimal weight vector maximizing (??) is [9]
$$ \boldsymbol w^{*}=\left[ \begin{array}{c} w_{1}^{*} \\ w_{2}^{*} \end{array}\right] \propto \boldsymbol{\varSigma}^{-1}\boldsymbol s\propto \left[ \begin{array}{c} d_{1}-\rho d_{2} \\ \lambda d_{2} -\rho\lambda d_{1} \end{array}\right] $$ (5)
where $d_{1}=\sqrt{D[h_{1}]}$, $d_{2}=\sqrt{D[h_{2}]}$ and $\lambda=\sigma_{1}/\sigma_{2}$. We define the ratio
$$ k = \frac{w_{1}^{*}}{w_{1}^{*}+w_{2}^{*}}=\frac{d_{1}-\rho d_{2}}{d_{1}+\lambda d_{2}-\rho(\lambda d_{1}+d_{2})}. $$ (6)
A direct calculation shows that the signs of $\partial k/\partial D[h_{1}]$ and $\partial k/\partial\rho$ are the same as those of $1-\rho^{2}$ and $D[h_{1}]-D[h_{2}]$, respectively. Therefore $\partial k/\partial D[h_{1}]>0$ always holds, and $\partial k/\partial\rho>0$ if and only if $D[h_{1}]>D[h_{2}]$.
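As a numerical sanity check (not part of the original proof), both assertions can be verified with arbitrarily chosen statistics; the sketch below checks assertion 1 directly and compares $\boldsymbol{\varSigma}^{-1}\boldsymbol s$ with the closed form in Eq. (5). All numerical values are illustrative only.

```python
# Numerical check of the two assertions, with illustrative statistics.
import numpy as np

s1, s2 = 1.0, 1.5        # positive "signal" terms
sig1, sig2 = 0.8, 0.9    # standard deviations
rho = 0.3                # correlation coefficient, rho < 1

# Assertion 1: if D[h2] > D[h1], then D[h1 + h2] > D[h1].
D1, D2 = (s1 / sig1)**2, (s2 / sig2)**2
D12 = (s1 + s2)**2 / (sig1**2 + sig2**2 + 2 * rho * sig1 * sig2)
assert D2 > D1 and D12 > D1

# Assertion 2: the optimal weights are proportional to Sigma^{-1} s,
# which matches the closed form [d1 - rho*d2, lambda*(d2 - rho*d1)].
Sigma = np.array([[sig1**2, rho * sig1 * sig2],
                  [rho * sig1 * sig2, sig2**2]])
w_opt = np.linalg.solve(Sigma, np.array([s1, s2]))
d1, d2, lam = s1 / sig1, s2 / sig2, sig1 / sig2
w_closed = np.array([d1 - rho * d2, lam * (d2 - rho * d1)])
print(w_opt / np.linalg.norm(w_opt))      # same direction as the line below
print(w_closed / np.linalg.norm(w_closed))
```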
A.2 Jacobian of FEIs
The Jacobian of $D(\boldsymbol w)$ in Eq. ?? is
FEIs of this type include $D_{FS}$, $D_{LR}$, the first part $g_{1}$ of $D_{\alpha}$ according to Eq. ??, and the first part of $D_{WS}$ according to Eq. ??.
The Jacobian of the second part $g_{2}$ of $D_{\alpha}(\boldsymbol w)$ can be deduced from Eq. ??,
For $\|\boldsymbol w\|=1$, the Jacobian of the second part of $D_{WS}(\boldsymbol w)$ can be deduced from Eq. ??,
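The FEI definitions and analytic Jacobians referenced above are not reproduced in this appendix. Purely as an illustration of how such a Jacobian could be used in an unsupervised weight optimization, the sketch below runs gradient ascent under the constraint $\|\boldsymbol w\|=1$ on an assumed Fisher-score-style index $D(\boldsymbol w)=(\boldsymbol w^{T}\boldsymbol s)^{2}/(\boldsymbol w^{T}\boldsymbol{\varSigma}\boldsymbol w)$; this placeholder index, the finite-difference Jacobian (`num_jacobian`) and all numbers are assumptions, not the paper's FEIs or their analytic Jacobians.

```python
# Illustrative only: projected gradient ascent on an assumed Fisher-score-
# style index D(w); an analytic Jacobian would replace num_jacobian.
import numpy as np

s = np.array([1.0, 1.5, 0.7])
Sigma = np.array([[0.64, 0.20, 0.10],
                  [0.20, 0.81, 0.05],
                  [0.10, 0.05, 0.49]])

def D(w):
    return (w @ s)**2 / (w @ Sigma @ w)

def num_jacobian(f, w, eps=1e-6):
    """Central finite-difference estimate of the Jacobian (gradient) of f."""
    grad = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        grad[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return grad

w = np.ones(3) / np.sqrt(3.0)            # start from equal weights, ||w|| = 1
for _ in range(500):
    w = w + 0.05 * num_jacobian(D, w)    # ascend the index
    w = w / np.linalg.norm(w)            # keep ||w|| = 1, as in the appendix
print(w, D(w))

# For this particular placeholder index the maximizer is known in closed
# form, w* proportional to Sigma^{-1} s, printed here for comparison.
w_star = np.linalg.solve(Sigma, s)
print(w_star / np.linalg.norm(w_star))
```

Here the numerically optimized $\boldsymbol w$ should align with $\boldsymbol{\varSigma}^{-1}\boldsymbol s$, mirroring assertion 2 of Appendix A.1.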
Cite this article
Wang, X., Zhang, H. & Peng, G. Evaluating and Optimizing Feature Combinations for Visual Loop Closure Detection. J Intell Robot Syst 104, 31 (2022). https://doi.org/10.1007/s10846-022-01575-7