Abstract
The Earth Mover’s Distance (EMD) is a similarity measure that has been applied successfully to multidimensional distributions in numerous domains. Although the EMD yields very effective results, its high computational complexity remains a bottleneck. Existing approaches within a filter-and-refine framework reduce the number of exact distance computations in order to lower query time. However, the refinement phase, in which the exact EMD is computed, still dominates the overall query processing time. We therefore propose to speed up the refinement phase with a novel feasible initialization technique (INIT) for the EMD computation that reuses the state-of-the-art lower bound IM-Sig. Our experimental evaluation on three real-world datasets demonstrates the efficiency of our approach. (This work is partially based on [12].)
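The filter-and-refine idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the INIT technique or the IM-Sig bound from the paper: as a simplifying assumption it uses equal-mass 1-D histograms, for which the exact EMD has a closed form, and the centroid distance as a stand-in lower bound. All function names (`emd_1d`, `centroid_lower_bound`, `knn_filter_refine`) are hypothetical.

```python
import heapq
from itertools import accumulate


def emd_1d(x, y):
    """Exact EMD between equal-mass 1-D histograms, ground distance |i - j|."""
    # In one dimension the optimal transport cost equals the L1 distance
    # between the two cumulative sums.
    return sum(abs(a - b) for a, b in zip(accumulate(x), accumulate(y)))


def centroid_lower_bound(x, y):
    """Cheap stand-in lower bound: distance between the mass-weighted centres.

    For equal total mass this value never exceeds emd_1d(x, y).
    """
    cx = sum(i * m for i, m in enumerate(x))
    cy = sum(i * m for i, m in enumerate(y))
    return abs(cx - cy)


def knn_filter_refine(query, database, k):
    """Return the k nearest (distance, index) pairs and the refinement count."""
    # Filter step: rank every candidate by its lower bound.
    ranked = sorted(
        (centroid_lower_bound(query, h), i) for i, h in enumerate(database)
    )
    best = []  # max-heap of the k best exact distances (stored negated)
    refinements = 0
    for lb, i in ranked:
        # Stop once the lower bound reaches the current k-th exact distance:
        # no remaining candidate can be strictly closer.
        if len(best) == k and lb >= -best[0][0]:
            break
        d = emd_1d(query, database[i])  # refinement step: exact distance
        refinements += 1
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), refinements
```

In this sketch the refinement loop is where the exact distance is computed, which is the phase the abstract identifies as dominating query time; a tighter lower bound lets the loop terminate after fewer exact computations.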
References
Assent, I., Wenning, A., Seidl, T.: Approximation techniques for indexing the earth mover’s distance in multimedia databases. In: ICDE, p. 11 (2006)
Cohen, S.D., Guibas, L.J.: The earth mover’s distance: lower bounds and invariance under translation, Technical report. Stanford University (1997)
Gondzio, J.: Interior point methods 25 years later. EJOR 218(3), 587–601 (2012)
Hillier, F., Lieberman, G.: Introduction to Linear Programming. McGraw-Hill, New York (1990)
Hinneburg, A., Lehner, W.: Database support for 3D-protein data set analysis. In: SSDBM, pp. 161–170 (2003)
Kusner, M.J., Sun, Y., Kolkin, N.I., Weinberger, K.Q.: From word embeddings to document distances. In: ICML, pp. 957–966 (2015)
Lehmann, T., et al.: Content-based image retrieval in medical applications. Methods Inf. Med. 43(4), 354–361 (2004)
Pele, O., Werman, M.: A linear time histogram metric for improved SIFT matching. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008. LNCS, vol. 5304, pp. 495–508. Springer, Heidelberg (2008). doi:10.1007/978-3-540-88690-7_37
Rubner, Y., Tomasi, C., Guibas, L.: A metric for distributions with applications to image databases. In: ICCV, pp. 59–66 (1998)
Ruttenberg, B.E., Singh, A.K.: Indexing the earth mover’s distance using normal distributions. PVLDB 5(3), 205–216 (2011)
Seidl, T., Kriegel, H.: Optimal multi-step k-nearest neighbor search. In: SIGMOD, pp. 154–165 (1998)
Uysal, M.S.: Efficient Similarity Search in Large Multimedia Databases. Apprimus Verlag (2017)
Uysal, M.S., et al.: Efficient filter approximation using the EMD in very large multimedia databases with feature signatures. In: CIKM, pp. 979–988 (2014)
Vanderbei, R.J.: Linear Programming: Foundations and Extensions. Springer, US (2014)
Vandersmissen, B., et al.: The rise of mobile and social short-form video: an in-depth measurement study of vine. In: SoMuS, vol. 1198, pp. 1–10 (2014)
Wichterich, M., et al.: Efficient EMD-based similarity search in multimedia databases via flexible dimensionality reduction. In: SIGMOD, pp. 199–212 (2008)
Xu, J., Zhang, Z., et al.: Efficient and effective similarity search over probabilistic data based on earth mover’s distance. PVLDB 3(1), 758–769 (2010)
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Uysal, M.S., Driessen, K., Brockhoff, T., Seidl, T. (2017). Fast Similarity Search with the Earth Mover’s Distance via Feasible Initialization and Pruning. In: Beecks, C., Borutta, F., Kröger, P., Seidl, T. (eds) Similarity Search and Applications. SISAP 2017. Lecture Notes in Computer Science, vol 10609. Springer, Cham. https://doi.org/10.1007/978-3-319-68474-1_10
DOI: https://doi.org/10.1007/978-3-319-68474-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68473-4
Online ISBN: 978-3-319-68474-1
eBook Packages: Computer Science, Computer Science (R0)