Abstract
We present the first attempt at creating a binary 3D feature descriptor for fast and efficient keypoint matching on 3D point clouds. Specifically, we propose a binarization technique and apply it to the state-of-the-art 3D feature descriptor, SHOT (Salti et al., Comput Vision Image Underst 125:251–264, 2014), to create the first binary 3D feature descriptor, which we call B-SHOT. Compared to SHOT, B-SHOT requires 32 times less memory for its representation and is six times faster in feature descriptor matching. Next, we propose a robust evaluation metric designed specifically for 3D feature descriptors. A comprehensive evaluation on standard benchmarks reveals that B-SHOT offers keypoint matching performance comparable to that of the state-of-the-art real-valued 3D feature descriptors, albeit at dramatically lower computational and memory costs.
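To make the memory figure concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 352-dimensional descriptor stored as single-precision floats (as in PCL's `pcl::SHOT352`) and uses random bits as a stand-in for the output of the binarization step, which is not reproduced here. The helper names `pack_bits` and `hamming_distance` are hypothetical.

```python
import numpy as np

SHOT_DIMS = 352  # assumed descriptor length, as in PCL's pcl::SHOT352


def pack_bits(bits):
    """Pack an (n, 352) array of 0/1 values into bytes, 8 bits per byte."""
    return np.packbits(bits.astype(np.uint8), axis=1)


def hamming_distance(a, b):
    """Count the differing bits between two packed binary descriptors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Memory per descriptor: 352 float32 values versus 352 single bits.
real_bytes = SHOT_DIMS * 4          # 4 bytes per float32 value
binary_bytes = SHOT_DIMS // 8       # 8 bits per byte
ratio = real_bytes // binary_bytes  # memory reduction factor

# Placeholder bits standing in for binarized descriptors (assumption).
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(2, SHOT_DIMS))
packed = pack_bits(bits)
d = hamming_distance(packed[0], packed[1])
```

Under these assumptions the ratio works out to 32, and matching reduces to an XOR followed by a bit count rather than a floating-point distance computation.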
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig3_HTML.jpg)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig6_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig7_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig8_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig9_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig10_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10514-016-9612-y/MediaObjects/10514_2016_9612_Fig11_HTML.gif)
Notes
Tombari et al. (2013) presented a comprehensive survey and performance evaluation of various 3D keypoint detectors.
We employ the default implementation of the SHOT feature descriptor available through the Point Cloud Library at www.pointclouds.org.
Details of how we added the extra information about the relative largeness, together with the experimental results, are available at http://tinyurl.com/eb-shot.
State-of-the-art 3D keypoint detectors achieve at most 0.5 relative repeatability (Tombari et al. 2013), i.e., only half of the detected keypoints between a scene and a model lie exactly at the same positions.
This can also be seen from Fig. 9 of Salti et al. (2014).
3D Object Recognition based on Correspondence Grouping http://pointclouds.org/documentation/tutorials/correspondence_grouping.php.
We employ the pcl::registration::CorrespondenceEstimation class from the Point Cloud Library (www.pointclouds.org) to estimate reciprocal correspondences; it internally uses a kd-tree for faster matching and retrieval.
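As an illustration of what a reciprocal correspondence is, here is a minimal sketch. It swaps in a brute-force NumPy search in place of PCL's kd-tree, so it is not the PCL implementation; the function names and the toy 2D descriptors are hypothetical. A pair (i, j) is kept only when j is the nearest neighbour of i and i is also the nearest neighbour of j.

```python
import numpy as np


def nearest_neighbours(src, dst):
    """Return, for each row of src, the index of the closest row of dst."""
    # Pairwise squared Euclidean distances, shape (len(src), len(dst)).
    diff = src[:, None, :] - dst[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    return dist2.argmin(axis=1)


def reciprocal_correspondences(src, dst):
    """Keep only the pairs that are mutual nearest neighbours."""
    fwd = nearest_neighbours(src, dst)  # src -> dst
    bwd = nearest_neighbours(dst, src)  # dst -> src
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]


# Toy descriptors: the third source descriptor has no mutual match.
src = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
dst = np.array([[1.1, 0.9], [0.1, -0.1]])
pairs = reciprocal_correspondences(src, dst)
print(pairs)  # [(0, 1), (1, 0)]
```

The brute-force search costs time proportional to the product of the two set sizes, which is why a spatial index such as a kd-tree is preferred for real-valued descriptors at scale.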
References
Alahi, A., Ortiz, R., & Vandergheynst, P. (2012). FREAK: Fast retina keypoint. In 2012 IEEE conference on computer vision and pattern recognition (CVPR).
Albarelli, A., Rodola, E., & Torsello, A. (2010). A game-theoretic approach to fine surface registration without initial motion estimation. In 2010 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 430–437). IEEE.
Albarelli, A., Rodola, E., & Torsello, A. (2010). Loosely distinctive features for robust surface alignment. In Computer vision—ECCV 2010 (pp. 519–532). Springer.
Aldoma, A., Marton, Z. C., Tombari, F., Wohlkinger, W., Potthast, C., Zeisl, B., et al. (2012). Point cloud library: Three-dimensional object recognition and 6 DoF Pose Estimation. IEEE Robotics & Automation Magazine, 1070(9932/12).
Aldoma, A., Tombari, F., Di Stefano, L., & Vincze, M. (2015). A global hypothesis verification framework for 3D object recognition in clutter. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99), 1–1. doi:10.1109/TPAMI.2015.2491940.
Besl, P. J., & McKay, H. D. (1992). A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239–256.
Calonder, M., Lepetit, V., Ozuysal, M., Trzcinski, T., Strecha, C., & Fua, P. (2012). BRIEF: Computing a local binary descriptor very fast. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(7), 1281–1298. doi:10.1109/TPAMI.2011.222.
Calonder, M., Lepetit, V., Strecha, C., & Fua, P. (2010). Brief: Binary robust independent elementary features. In Computer vision—ECCV 2010 (pp. 778–792). Springer.
Chen, H., & Bhanu, B. (2007). 3D free-form object recognition in range images using local surface patches. Pattern Recognition Letters, 28(10), 1252–1262.
Choi, S., Zhou, Q. Y., & Koltun, V. (2015). Robust reconstruction of indoor scenes. In IEEE conference on computer vision and pattern recognition (CVPR).
Choi, S., Zhou, Q. Y., Miller, S., & Koltun, V. (2016). A large dataset of object scans. arXiv:1602.02481
Chua, C. S., & Jarvis, R. (1997). Point signatures: A new representation for 3D object recognition. International Journal of Computer Vision, 25(1), 63–85.
Darom, T., & Keller, Y. (2012). Scale-invariant features for 3-D mesh models. IEEE Transactions on Image Processing, 21(5), 2758–2769. doi:10.1109/TIP.2012.2183142.
Endres, F., Hess, J., Engelhard, N., Sturm, J., Cremers, D., & Burgard, W. (2012). An evaluation of the RGB-D SLAM system. In Proceedings of IEEE International Conference on Robotics and Automation (ICRA) (pp. 1691–1696).
Faulhammer, T., Aldoma, A., Zillich, M., & Vincze, M. (2015). Temporal integration of feature correspondences for enhanced recognition in cluttered and dynamic environments. In 2015 IEEE International Conference on robotics and automation (ICRA) (pp. 3003–3009). doi:10.1109/ICRA.2015.7139611
Fiolka, T., Stückler, J., Klein, D. A., Schulz, D., & Behnke, S. (2012). SURE: Surface entropy for distinctive 3D features. In Spatial cognition VIII (pp. 74–93). Springer.
Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395. doi:10.1145/358669.358692.
Frome, A., Huber, D., Kolluri, R., Bülow, T., & Malik, J. (2004). Recognizing objects in range data using regional point descriptors. In Computer vision-ECCV 2004 (pp. 224–237). Springer.
Galvez-Lopez, D., & Tardos, J. (2012). Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics, 28(5), 1188–1197. doi:10.1109/TRO.2012.2197158.
Gong, Y., Lazebnik, S., Gordo, A., & Perronnin, F. (2013). Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12), 2916–2929.
Guo, Y., Bennamoun, M., Sohel, F., Lu, M., & Wan, J. (2014). 3D object recognition in cluttered scenes with local surface features: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11), 2270–2287. doi:10.1109/TPAMI.2014.2316828.
Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J., & Kwok, N. (2015). A comprehensive performance evaluation of 3D local feature descriptors. International Journal of Computer Vision (pp. 1–24). doi:10.1007/s11263-015-0824-y.
Guo, Y., Sohel, F., Bennamoun, M., Lu, M., & Wan, J. (2013). Rotational projection statistics for 3D local surface description and object recognition. International Journal of Computer Vision, 105(1), 63–86. doi:10.1007/s11263-013-0627-y.
Guo, Y., Sohel, F., Bennamoun, M., Wan, J., & Lu, M. (2014). An accurate and robust range image registration algorithm for 3D object modeling. IEEE Transactions on Multimedia, 16(5), 1377–1390. doi:10.1109/TMM.2014.2316145.
Hu, L., & Nooshabadi, S. (2015). G-SHOT: GPU accelerated 3D local descriptor for surface matching. Journal of Visual Communication and Image Representation, 30, 343–349.
Johnson, A. E., & Hebert, M. (1999). Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5), 433–449.
Knopp, J., Prasad, M., Willems, G., Timofte, R., & Van Gool, L. (2010). Hough transform and 3D SURF for robust three dimensional classification. In Computer vision–ECCV 2010 (pp. 589–602). Springer.
Leutenegger, S., Chli, M., & Siegwart, R. Y. (2011). BRISK: Binary robust invariant scalable keypoints. In 2011 IEEE international conference on computer vision (ICCV) (pp. 2548–2555). IEEE.
Leutenegger, S., Lynen, S., Bosse, M., Siegwart, R., & Furgale, P. (2015). Keyframe-based visual-inertial odometry using nonlinear optimization. The International Journal of Robotics Research, 34(3), 314–334.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2), 91–110.
Malaguti, F., Tombari, F., Salti, S., Pau, D., & Di Stefano, L. (2012). Toward compressed 3D descriptors. In 2012 second international conference on 3D imaging, modeling, processing, visualization and transmission (3DIMPVT) (pp. 176–183). doi:10.1109/3DIMPVT.2012.9.
Marton, Z. C., Pangercic, D., Blodow, N., Kleinehellefort, J., & Beetz, M. (2010). General 3D modelling of novel objects from a single view. In 2010 IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3700–3705). IEEE.
Mian, A., Bennamoun, M., & Owens, R. (2010). On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes. International Journal of Computer Vision, 89(2–3), 348–361.
Mikolajczyk, K., & Schmid, C. (2005). A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630.
Newcombe, R. A., Davison, A. J., Izadi, S., Kohli, P., Hilliges, O., Shotton, J., Molyneaux, D., Hodges, S., Kim, D., & Fitzgibbon, A. (2011). KinectFusion: Real-time dense surface mapping and tracking. In IEEE International Symposium on Mixed and augmented reality (ISMAR) (pp. 127–136).
Novatnack, J., & Nishino, K. (2008). Scale-dependent/invariant local 3D shape descriptors for fully automatic registration of multiple sets of range images. In Computer vision—ECCV 2008 (pp. 440–453). Springer.
Palossi, D., Tombari, F., Salti, S., Ruggiero, M., Stefano, L., & Benini, L. (2013). GPU-SHOT: Parallel optimization for real-time 3D local description. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 584–591).
Prakhya, S. M., Liu, B., & Lin, W. (2015). B-SHOT: A binary feature descriptor for fast and efficient keypoint matching on 3D point clouds. In 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS).
Prakhya, S. M., Liu, B., & Lin, W. (2016). Detecting keypoint sets on 3D point clouds via histogram of normal orientations. Pattern Recognition Letters. doi:10.1016/j.patrec.2016.06.002.
Prakhya, S. M., Liu, B., Lin, W., & Qayyum, U. (2015). Sparse depth odometry: 3D keypoint based pose estimation from dense depth data. In 2015 IEEE international conference on robotics and automation (ICRA)
Project Tango. https://www.google.com/atap/project-tango/
Rodolà, E., Albarelli, A., Bergamasco, F., & Torsello, A. (2012). A scale independent selection process for 3D object recognition in cluttered scenes. International Journal of Computer Vision, 102(1), 129–145.
Rodolà, E., Albarelli, A., Cremers, D., & Torsello, A. (2015). A simple and effective relevance-based point sampling for 3D shapes. Pattern Recognition Letters, 59, 41–47.
Rusu, R., & Cousins, S. (2011). 3D is here: Point Cloud Library (PCL). In 2011 IEEE international conference on robotics and automation (ICRA) (pp. 1–4). doi:10.1109/ICRA.2011.5980567
Rusu, R. B., Blodow, N., & Beetz, M. (2009). Fast point feature histograms (FPFH) for 3D registration. In ICRA’09. IEEE international conference on robotics and automation, 2009 (pp. 3212–3217). IEEE.
Rusu, R. B., Blodow, N., Marton, Z. C., & Beetz, M. (2008). Aligning point cloud views using persistent feature histograms. In IROS 2008. IEEE/RSJ international conference on intelligent robots and systems, 2008 (pp. 3384–3391). IEEE.
Salti, S., Tombari, F., & Di Stefano, L. (2014). SHOT: Unique signatures of histograms for surface and texture description. Computer Vision and Image Understanding, 125, 251–264.
Steder, B., Rusu, R., Konolige, K., & Burgard, W. (2011). Point feature extraction on 3D range scans taking into account object boundaries. In 2011 IEEE international conference on robotics and automation (ICRA) (pp. 2601–2608). doi:10.1109/ICRA.2011.5980187.
Strecha, C., Bronstein, A. M., Bronstein, M. M., & Fua, P. (2012). LDAHash: Improved matching with smaller descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(1).
Structure Sensor. http://structure.io/
Sturm, J., Bylow, E., Kahl, F., & Cremers, D. (2013). CopyMe3D: Scanning and printing persons in 3D. In Pattern recognition, Lecture Notes in Computer Science (pp. 405–414).
Tra, A. T., Lin, W., & Kot, A. (2015). Dominant SIFT: A novel compact descriptor. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP).
Tabia, H., Laga, H., Picard, D., & Gosselin, P. H. (2014). Covariance descriptors for 3D shape matching and retrieval. In 2014 IEEE Conference on computer vision and pattern recognition (CVPR) (pp. 4185–4192). IEEE.
Tombari, F., Salti, S., & Di Stefano, L. (2010). Unique shape context for 3D data description. In Proceedings of the ACM workshop on 3D object retrieval, 3DOR ’10 (pp. 57–62). New York, NY, USA: ACM. doi:10.1145/1877808.1877821.
Tombari, F., Salti, S., & Di Stefano, L. (2010). Unique signatures of histograms for local surface description. In Computer vision–ECCV 2010 (pp. 356–369). Springer.
Tombari, F., Salti, S., & Stefano, L. D. (2011). A combined texture-shape descriptor for enhanced 3D feature matching. In 2011 18th IEEE international conference on image processing (ICIP) (pp. 809–812). IEEE.
Tombari, F., Salti, S., & Stefano, L. D. (2013). Performance evaluation of 3D keypoint detectors. International Journal of Computer Vision, 102(1–3), 198–220. doi:10.1007/s11263-012-0545-4.
Trzcinski, T., Christoudias, M., & Lepetit, V. (2015). Learning image descriptors with boosting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3), 597–610.
Yang, X., & Cheng, K. T. (2014). Local difference binary for ultrafast and distinctive feature description. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(1), 188–194.
Zhong, Y. (2009). Intrinsic shape signatures: A shape descriptor for 3D object recognition. In 2009 IEEE 12th international conference on computer vision workshops (ICCV workshops) (pp. 689–696). doi:10.1109/ICCVW.2009.5457637.
Zhou, W., Li, H., Hong, R., Lu, Y., & Tian, Q. (2015). BSIFT: Toward data-independent codebook for large scale image search. IEEE Transactions on Image Processing, 24(3), 967–979. doi:10.1109/TIP.2015.2389624.
Cite this article
Prakhya, S.M., Liu, B., Lin, W. et al. B-SHOT: a binary 3D feature descriptor for fast Keypoint matching on 3D point clouds. Auton Robot 41, 1501–1520 (2017). https://doi.org/10.1007/s10514-016-9612-y
DOI: https://doi.org/10.1007/s10514-016-9612-y