Abstract
Establishing local visual correspondence between video frames is an important and challenging problem in many vision-based applications. Pixel-level matching based on local keypoint detection and description is a typical approach to visual correspondence estimation. Unlike traditional methods based on local keypoint descriptors, this paper proposes a comprehensive yet low-dimensional local feature descriptor based on superpixels generated by over-segmentation. The proposed descriptor extracts shape, texture, and color features from superpixels using the orientated center-boundary distance (OCBD), the gray-level co-occurrence matrix (GLCM), and the saturation histogram (SHIST), respectively. It therefore covers more types of features than existing descriptors, which extract only one specific kind of feature. Experimental results on the widely used Middlebury optical flow dataset show that the proposed superpixel descriptor achieves three times the accuracy of the state-of-the-art ORB descriptor, which has the same feature dimension as the proposed one. In addition, because the dimension of the proposed superpixel descriptor is low, it is convenient for matching and memory-efficient for hardware implementation.
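To make the three feature families concrete, the following is a minimal sketch of how a descriptor of this kind might be assembled for one superpixel. It is not the authors' implementation: the abstract does not define the features, so the ray-casting form of the center-boundary distance, the choice of three GLCM statistics (contrast, energy, homogeneity), the bin and level counts, and all function names (`ocbd`, `glcm_features`, `saturation_hist`, `describe`) are assumptions made for illustration. The superpixel is assumed to be given as a boolean mask, the gray image as 8-bit values, and the saturation channel as values in [0, 1].

```python
import numpy as np

def ocbd(mask, n_dirs=8):
    """Shape: centroid-to-boundary distance along n_dirs rays (assumed form)."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    h, w = mask.shape
    dists = np.zeros(n_dirs)
    for k in range(n_dirs):
        theta = 2.0 * np.pi * k / n_dirs
        dy, dx = np.sin(theta), np.cos(theta)
        r = 0.0
        while True:
            # Step one pixel further along the ray and round to the nearest pixel.
            y = int(np.floor(cy + (r + 1.0) * dy + 0.5))
            x = int(np.floor(cx + (r + 1.0) * dx + 0.5))
            if not (0 <= y < h and 0 <= x < w) or not mask[y, x]:
                break
            r += 1.0
        dists[k] = r
    total = dists.sum()
    # Normalize so the shape part does not depend on superpixel size.
    return dists / total if total > 0 else dists

def glcm_features(gray, mask, levels=8):
    """Texture: statistics of a symmetric horizontal GLCM inside the mask."""
    q = (gray.astype(np.int64) * levels) // 256      # quantize 8-bit gray levels
    a, b = q[:, :-1], q[:, 1:]                       # horizontally adjacent pairs
    valid = mask[:, :-1] & mask[:, 1:]               # both pixels in the superpixel
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (a[valid], b[valid]), 1)
    glcm = glcm + glcm.T                             # make the matrix symmetric
    s = glcm.sum()
    if s == 0:
        return np.zeros(3)
    p = glcm / s
    i, j = np.indices(p.shape)
    contrast = (p * (i - j) ** 2).sum()
    energy = (p ** 2).sum()
    homogeneity = (p / (1.0 + np.abs(i - j))).sum()
    return np.array([contrast, energy, homogeneity])

def saturation_hist(sat, mask, bins=8):
    """Color: normalized histogram of saturation values in [0, 1]."""
    hist, _ = np.histogram(sat[mask], bins=bins, range=(0.0, 1.0))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)

def describe(gray, sat, mask):
    """Concatenate shape, texture, and color parts into one vector."""
    return np.concatenate([
        ocbd(mask),
        glcm_features(gray, mask),
        saturation_hist(sat, mask),
    ])
```

With the default parameters assumed here, `describe` returns a 19-element vector (8 shape, 3 texture, 8 color values). This length is a by-product of the illustrative defaults and says nothing about the dimension of the descriptor in the paper.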







Acknowledgements
This work was supported by KAKENHI (16K13006) and Waseda University Grant for Special Research Projects (2017B-261).
Cite this article
Du, S., Ikenaga, T. Low-dimensional superpixel descriptor and its application in visual correspondence estimation. Multimed Tools Appl 78, 19457–19472 (2019). https://doi.org/10.1007/s11042-019-7248-6