Abstract
Computer vision deals with identifying and detecting objects of a particular class in digital images and videos. Local feature detection and description play an essential role in many computer vision applications, such as object detection and object classification. The accuracy of these applications depends on the performance of the local feature detectors and descriptors they use. Over the past decades, new algorithms and techniques have been introduced alongside the development of machine learning and deep learning. Machine learning techniques can take this work to the next level when sufficient data is provided, and deep learning algorithms can handle large amounts of data efficiently. However, this variety raises the question of which algorithm and method to select for a particular application in order to increase performance. The choice depends heavily on the type of application and the amount of data to be handled. This motivated us to write a comprehensive survey of local image feature detectors and descriptors, from established state-of-the-art methods to recent ones. This paper presents feature detection and description methods in the visible band, with their advantages and disadvantages. We also give an overview of current performance evaluations and benchmark datasets, and describe methods and algorithms for finding features beyond the visible band. Finally, we conclude the survey with future directions. This survey may help researchers and serve as a reference in the field of computer vision.
About this article
Cite this article
Joshi, K., Patel, M.I. Recent advances in local feature detector and descriptor: a literature survey. Int J Multimed Info Retr 9, 231–247 (2020). https://doi.org/10.1007/s13735-020-00200-3
DOI: https://doi.org/10.1007/s13735-020-00200-3