Abstract
A novel method for shape analysis and similarity measurement is introduced, based on a time series matching approach. It applies to shapes represented as one-dimensional signals and aims to use the available information efficiently and to optimize the shape matching process. The new technique is tested on boundaries from leaf images, after their conversion into 1D sequences using either the Centroid Contour Distance (CCD) or the Angle Code (AC) measurements. At the core of the new method lies the ‘time delay’-based transformation of a given 1D sequence into an ensemble of vectors embedded in a multivariate phase space. The resulting point set is considered representative of the leaf identity. Inter-leaf comparisons are carried out in a pairwise fashion by employing the multidimensional Wald–Wolfowitz statistical test for the ‘two-sample problem’, which implicitly performs shape matching and similarity quantification. The comparative experiments show that the complexity of our method is moderate, while the leaf retrieval performance is greatly improved compared to that achieved by the standard matching procedures usually employed with the CCD and AC representations.
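As an illustration of the ‘time delay’ transformation mentioned above, the following minimal sketch maps a 1D sequence to a set of delay vectors. The function name `delay_embed`, the parameters `dim` and `delay`, the sample values, and the choice not to wrap around the closed contour are all assumptions made for this example, not details taken from the method itself.

```python
def delay_embed(signal, dim, delay):
    """Map a 1D sequence to delay vectors
    (s[i], s[i + delay], ..., s[i + (dim - 1) * delay]).

    Assumption: the sequence is treated as open (no cyclic wrap-around).
    """
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive integers")
    n_vectors = len(signal) - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("signal is too short for this dim/delay")
    return [
        tuple(signal[i + k * delay] for k in range(dim))
        for i in range(n_vectors)
    ]


# Hypothetical CCD-like sequence (made-up values, for illustration only).
ccd = [5.0, 5.2, 6.1, 7.0, 6.4, 5.5, 5.1, 5.0]
points = delay_embed(ccd, dim=3, delay=2)
print(len(points), points[0])
```

Each tuple in `points` is one point of the embedded set; two such sets, one per leaf, would be the inputs to the two-sample comparison.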
References
Nam Y, Hwang E, Kim D (2005) CLOVER: a mobile content-based leaf image retrieval system. In: Digital libraries: implementing strategies and sharing experiences. LNCS, vol 3815, pp 139–148. doi:10.1007/11599517_16
Belhumeur PN, Chen D, Feiner S, Jacobs DW, Kress WJ, Ling H, Lopez I, Ramamoorthi R, Sheorey S, White S, Zhang L (2008) Searching the world’s herbaria: a system for visual identification of plant species. In: ECCV, part IV. LNCS, vol 5305, pp 116–129. doi:10.1007/978-3-540-88693-8_9
Kebapci H, Yanicoglou B, Unal G (2010) Plant image retrieval using color, shape and texture features. Comput J. doi:10.1093/comjnl/bxq037
Zhang D, Lu G (2004) Review of shape representation and description techniques. Pattern Recogn 37(1):1–19
Mehtre BVM, Kankanhalli MS, Lee WF (1997) Shape measures for content based image retrieval: a comparison. Inf Process Manag 33(3). doi:10.1016/S0306-4573(96)00069-6
Wang Z, Chi W, Feng D (2003) Shape based leaf image retrieval. IEE Proc Vision Image Signal Process 150(1):34–43. doi:10.1049/ip-vis:20030160
Mokhtarian F, Abbasi S (2004) Matching shapes with self-intersections: application to leaf classification. IEEE Trans Image Process 13(5):653–661. doi:10.1109/TIP.2004.826126
Nam Y, Hwang E, Kim D (2008) A similarity-based leaf image retrieval scheme: joining shape and venation features. Comput Vis Image Underst 110(2):245–259. doi:10.1016/j.cviu.2007.08.002
Lee C-L, Chen S-Y (2006) Classification of leaf images. Int J Imaging Syst Technol 16(1):15–23
Cabalero C, Aranda M (2010) Plant species identification using leaf image retrieval. In: Proceedings of the ACM international conference on image and video retrieval, Xi’an, China, pp 327–334. doi:10.1145/1816041.1816089
Casanova D, Junior J Sa, Bruno Od (2009) Plant leaf identification using Gabor wavelets. Int J Imaging Syst Technol 19(3):236–243. doi:10.1002/ima.v19:3
Park J, Jun E, Nam Y (2008) Utilizing venation features for efficient leaf image retrieval. J Syst Softw 81(1):71–82. doi:10.1016/j.jss.2007.05.001
Shen Y, Zhou C, Lin K (2005) Leaf image retrieval using a shape based method. In: IFIP international federation for information processing 187/2005, pp 711–719. doi:10.1007/0-387-29295-0_77
Peng HL, Chen SY (1997) Trademark shape recognition using closed contours. Pattern Recogn Lett 18(8):791–803. doi:10.1016/S0167-8655(97)00050-0
Gdalyahu Y, Weinshall D (1999) Flexible syntactic matching of curves and its application to automatic hierarchical classification of silhouettes. IEEE Trans Pattern Anal Mach Intell 21(12):1312–1328. doi:10.1109/34.817410
Santosh KC (2010) Use of dynamic time warping for object shape classification through signature. Kathmandu Univ J Sci Eng Technol 6:33–49. doi:10.3126/kuset.v6i1.3308
Tak YS (2007) A leaf image retrieval scheme based on partial dynamic time warping and two level-filtering. In Wei D, Miyazaki T, Paik I (eds) Proceedings of the 7th IEEE international conference on computer and information technology—CIT 2007. IEEE Computer Society, Los Alamitos, pp 633–638. doi:10.1109/CIT.2007.158
Belongie S, Malik J, Puzicha J (2002) Shape matching and object recognition using shape context. IEEE Trans Pattern Anal Mach Intell 24(4):509–522. doi:10.1109/34.993558
Liang H, Jacobs D (2007) Shape classification using the inner-distance. IEEE Trans Pattern Anal Mach Intell 29(2):286–299. doi:10.1109/TPAMI.2007.41
Wu SG, Bao FS, Xu EY, Wang Y-X, Chang Y-F, Xiang Q-L (2007) A leaf recognition algorithm for plant classification using probabilistic neural network. In: IEEE 7th international symposium on signal processing and information technology, pp 11–16. doi:10.1109/ISSPIT.2007.4458016. http://flavia.sourceforge.net/
Chang C, Hwang S, Buehrer D (1991) A shape recognition scheme based on relative distances of feature points from the centroid. Pattern Recogn 24(11):1053–1063. doi:10.1016/0031-3203(91)90121-K
Friedman JH, Rafsky LC (1979) Multivariate generalizations of the Wald–Wolfowitz and Smirnov two-sample tests. Ann Stat 7(4):697–717. doi:10.1214/aos/1176344722
Abarbanel HDI (1996) Analysis of observed chaotic data. Springer Verlag, New York
Chan H-L, Fang S-C, Chao P-K, Wang C-L, Wei J-D (2009) Phase-space reconstruction of electrocardiogram for heartbeat classification. In: WC IFMBE proceedings, vol 25(4), pp 1234–1237. doi:10.1007/978-3-642-03882-2_327
Laskaris N, Zafeiriou S, Garefa L (2009) Use of random time-intervals (RTIs) generation for biometric verification. Pattern Recogn 42(11):2787–2796. doi:10.1016/j.patcog.2008.12.028
Peyre G (2009) Manifold models for signals and images. Comput Vis Image Underst 113(2):249–260. doi:10.1016/j.cviu.2008.09.003
Cao L (1997) Practical method for determining the minimum embedding dimension of a scalar time series. Phys D 110(1–2):43–50
Rubner Y, Puzicha J, Tomasi C, Buhmann JM (2001) Empirical evaluation of dissimilarity measures for color and texture. Comput Vis Image Underst 84:25–43
Theoharatos Ch, Laskaris N, Economou G, Fotopoulos S (2005) A generic scheme for color image retrieval based on the multivariate Wald–Wolfowitz test. IEEE Trans Knowl Data Eng 17(6):808–819. doi:10.1109/TKDE.2005.85
Zahn CT (1971) Graph–theoretical methods for detecting and describing gestalt clusters. IEEE Trans Comput C 20(1). doi:10.1109/T-C.1971.223083
Dijkstra EW (1959) A note on two problems in connexion with graphs. Numer Math 1(1):269–271. doi:10.1007/BF01386390
Theoharatos Ch, Laskaris N, Economou G, Fotopoulos S (2004) A similarity measure for color image retrieval and indexing based on the multivariate two sample problem. In: Proceedings of EUSIPCO, Vienna, Austria
Appendix A. The multivariate Wald–Wolfowitz test
The Wald–Wolfowitz multivariate statistical test [22] assesses the commonality between two different sets of multivariate observations.
The output of the test can be expressed as the probability that two point-samples come from the same distribution. The test is model-free: no a priori assumption about the distribution of points in the two samples is required [31]. This property stems from the graph-theoretic origin of the test, which is based on the concept of the minimal spanning tree (MST) [29, 30]. For a connected, edge-weighted graph with N nodes, the MST is the spanning tree, containing exactly (N − 1) edges, for which the sum of edge weights is minimum. In the Wald–Wolfowitz test (WW-test), the graph is built over points in \( \mathbb{R}^d \): every given point corresponds to a single node, and the weight associated with every possible edge is the corresponding interpoint Euclidean distance. The WW-test can be used to test the hypothesis \( H_0 \) that two given multidimensional point samples \( \{X_i\}_{i=1:m} \) and \( \{Y_i\}_{i=1:n} \) come from the same multivariate distribution.
In the first step, the sample identity of each point is not taken into account and the MST of the overall sample is constructed.
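This first step can be sketched as follows, assuming Prim's algorithm on the complete Euclidean graph. The appendix does not name an MST algorithm; the function name `pooled_mst` and the toy samples are hypothetical.

```python
import math


def pooled_mst(points):
    """Return the MST edges (i, j) of the complete Euclidean graph on
    `points`, using an O(N^2) version of Prim's algorithm."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each node
    best_from = [-1] * n         # tree node that provides that link
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the closest node that is not yet in the tree.
        j = min((k for k in range(n) if not in_tree[k]),
                key=best_dist.__getitem__)
        edges.append((best_from[j], j))
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[j], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = j
    return edges


# Hypothetical toy samples; sample identities are ignored when building the MST.
x_sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
y_sample = [(10.0, 0.0), (11.0, 0.0)]
pooled = x_sample + y_sample
labels = [0] * len(x_sample) + [1] * len(y_sample)  # kept for the next step
edges = pooled_mst(pooled)
print(edges)
```

The `labels` list plays no role in the construction itself; it is only carried along so that the sample identities can be restored once the tree is built.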
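Then, based on the sample identities of the points, a test statistic R is computed. R is the total number of runs, where a run is defined as a consecutive sequence of identical sample identities; in the multivariate setting, this corresponds to a connected subtree of the MST whose nodes all belong to the same sample, obtained by removing every edge that links points from different samples. \( H_0 \) is rejected for small values of R. The null distribution of this statistic has been derived based on combinatorial analysis. It has been shown that the quantity
\[ W = \frac{R - E[R]}{\sqrt{\mathrm{Var}[R \mid C]}} \]
approaches (asymptotically) the standard normal distribution, while the mean \( E[R] \) and variance \( \mathrm{Var}[R \mid C] \) of R depend on the sizes m and n of the two point-samples and can be computed using the following analytical expressions:
\[ E[R] = \frac{2mn}{N} + 1 \]
\[ \mathrm{Var}[R \mid C] = \frac{2mn}{N(N-1)} \left\{ \frac{2mn - N}{N} + \frac{C - N + 2}{(N-2)(N-3)} \left[ N(N-1) - 4mn + 2 \right] \right\} \]
where N = m + n, C is the number of edge pairs sharing a common node, defined as \( C = \frac{1}{2}\sum\nolimits_{i=1}^{N} d_{i}(d_{i} - 1) \), and \( d_i \) is the degree of the ith node.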
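A minimal sketch of the statistic, assuming the standard Friedman–Rafsky expressions for the mean and conditional variance of R [22] and counting R as one plus the number of MST edges that join points from different samples. The function name `ww_statistic`, the 0/1 label convention, and the toy tree are hypothetical.

```python
import math


def ww_statistic(edges, labels):
    """Return (R, E[R], Var[R|C], W) for MST `edges` over points whose
    sample identities are given by `labels` (0 for X, 1 for Y)."""
    n_total = len(labels)
    m = sum(1 for lab in labels if lab == 0)
    n = n_total - m
    if n_total < 4 or m == 0 or n == 0:
        raise ValueError("need two non-empty samples and N >= 4")

    # Removing k cross-sample edges from a tree leaves k + 1 subtrees (runs).
    cross = sum(1 for i, j in edges if labels[i] != labels[j])
    runs = cross + 1

    # C: number of edge pairs sharing a common node.
    degree = [0] * n_total
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    c = sum(d * (d - 1) for d in degree) // 2

    mean_r = 2 * m * n / n_total + 1
    var_r = (2 * m * n / (n_total * (n_total - 1))) * (
        (2 * m * n - n_total) / n_total
        + (c - n_total + 2) / ((n_total - 2) * (n_total - 3))
        * (n_total * (n_total - 1) - 4 * m * n + 2)
    )
    w = (runs - mean_r) / math.sqrt(var_r)
    return runs, mean_r, var_r, w


# Hypothetical MST: a path 0-1-2-3-4-5 with well-separated samples.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
toy_labels = [0, 0, 0, 1, 1, 1]
runs, mean_r, var_r, w = ww_statistic(toy_edges, toy_labels)
print(runs, mean_r, var_r, w)
```

In this toy tree only one edge links the two samples, so R = 2 against an expected value of 4, which yields a negative W, the direction that speaks against the two samples sharing a distribution.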
The above analysis enables the computation of the significance level (and p value) at which the hypothesis \( H_0 \) is accepted or rejected.
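Because small values of R speak against the null hypothesis, one way to obtain a p value is as the lower tail of the standard normal distribution at the standardized statistic. The sketch below assumes this one-sided convention; the function name `ww_p_value` and the sample input are hypothetical.

```python
import math


def ww_p_value(w):
    """One-sided p value P(Z <= w) for a standard normal Z,
    where w is the standardized number of runs."""
    return 0.5 * math.erfc(-w / math.sqrt(2.0))


# Hypothetical standardized statistic (R = 2, E[R] = 4, Var[R|C] = 1.2).
w_example = (2 - 4) / math.sqrt(1.2)
p_example = ww_p_value(w_example)
print(p_example)  # about 0.034
```

The normal approximation is asymptotic, so for very small point sets a p value obtained this way should be read as a rough indication only.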
Cite this article
Fotopoulou, F., Laskaris, N., Economou, G. et al. Advanced leaf image retrieval via Multidimensional Embedding Sequence Similarity (MESS) method. Pattern Anal Applic 16, 381–392 (2013). https://doi.org/10.1007/s10044-011-0254-6
DOI: https://doi.org/10.1007/s10044-011-0254-6