DOI: 10.1145/2791347.2791384

Efficient similarity search in scientific databases with feature signatures

Published: 29 June 2015

Abstract

The recent rapid growth of scientific data necessitates efficient similarity search techniques, for which convenient object representation models are of vital importance. Feature signatures, a highly flexible form of object feature representation, have gained increasing attention, and corresponding techniques for improving efficiency are being developed. In this paper, we focus on efficient query processing with the well-known Earth Mover's Distance (EMD) on databases of feature signatures. We propose efficient approximation techniques that apply to high-dimensional feature signatures via dimensionality reduction and that guarantee completeness, i.e. no false dismissals, within a filter-and-refine architecture. Rigorous experiments on real-world data indicate a considerable reduction in the number of EMD computations and a significant reduction in query processing time.
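The filter-and-refine idea in the abstract can be illustrated with a minimal sketch. It is not the approximation proposed in the paper: as a simplifying assumption, signatures are one-dimensional lists of (position, weight) pairs whose weights sum to 1, the ground distance is |x - y|, and the filter is the classic centroid lower bound rather than a dimensionality-reduction-based one. The function names `emd_1d`, `centroid_lower_bound`, and `knn_filter_refine` are hypothetical. Because the filter distance never exceeds the exact EMD, skipping candidates whose lower bound is already larger than the current k-th best exact distance cannot cause a false dismissal.

```python
import heapq


def emd_1d(p, q):
    """Exact EMD between two non-empty 1-D signatures (ground distance |x - y|).

    Assumes each signature is a list of (position, weight) pairs with
    weights summing to 1. In one dimension the EMD equals the area between
    the two cumulative weight functions, computed here by a single sweep.
    """
    events = sorted([(x, w) for x, w in p] + [(x, -w) for x, w in q])
    total, cum, prev = 0.0, 0.0, events[0][0]
    for x, w in events:
        total += abs(cum) * (x - prev)  # area accumulated since last position
        cum += w                        # difference of cumulative weights
        prev = x
    return total


def centroid_lower_bound(p, q):
    """Cheap filter distance: |mean(p) - mean(q)|, which never exceeds emd_1d."""
    mean_p = sum(x * w for x, w in p)
    mean_q = sum(x * w for x, w in q)
    return abs(mean_p - mean_q)


def knn_filter_refine(query, database, k):
    """Return (k nearest as sorted (distance, index) pairs, number of exact EMDs)."""
    # Filter step: rank every object by its lower bound.
    candidates = sorted(
        (centroid_lower_bound(query, s), i) for i, s in enumerate(database)
    )
    best = []          # max-heap of the k best so far, stored as (-distance, index)
    refinements = 0
    for lower_bound, i in candidates:
        # Stop once the lower bound exceeds the current k-th best exact
        # distance: no remaining object can enter the result.
        if len(best) == k and lower_bound > -best[0][0]:
            break
        d = emd_1d(query, database[i])  # refine step: exact distance
        refinements += 1
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-neg_d, i) for neg_d, i in best), refinements


# Illustrative data, invented for this sketch.
database = [
    [(0.0, 0.5), (2.0, 0.5)],
    [(1.0, 1.0)],
    [(5.0, 1.0)],
    [(10.0, 0.5), (12.0, 0.5)],
]
query = [(1.0, 1.0)]
print(knn_filter_refine(query, database, k=2))
```

In this toy run, two of the four objects are never passed to the exact EMD, which is the kind of saving the abstract refers to as a reduction in the number of EMD computations. A tighter lower bound prunes more candidates at the same guarantee.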



Published In

SSDBM '15: Proceedings of the 27th International Conference on Scientific and Statistical Database Management
June 2015
390 pages
ISBN: 9781450337090
DOI: 10.1145/2791347
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. earth mover's distance
  2. feature signatures
  3. lower bound
  4. scientific databases

Qualifiers

  • Research-article

Funding Sources

  • DFG

Conference

SSDBM 2015

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%


Cited By

  • (2022) A novel unsupervised multiple feature hashing for image retrieval and indexing (MFHIRI). Journal of Visual Communication and Image Representation, 84 (103467). DOI: 10.1016/j.jvcir.2022.103467. Online publication date: Apr-2022
  • (2019) A survey of image data indexing techniques. Artificial Intelligence Review, 52:2 (1189-1266). DOI: 10.1007/s10462-018-9673-8. Online publication date: 1-Aug-2019
  • (2017) When content-based video retrieval and human computation unite: Towards effective collaborative video search. 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW) (214-219). DOI: 10.1109/ICMEW.2017.8026262. Online publication date: Jul-2017
  • (2016) Geometric Graph Indexing for Similarity Search in Scientific Databases. Proceedings of the 28th International Conference on Scientific and Statistical Database Management (1-12). DOI: 10.1145/2949689.2949691. Online publication date: 18-Jul-2016
  • (2016) Approximation-Based Efficient Query Processing with the Earth Mover's Distance. Proceedings, Part II, of the 21st International Conference on Database Systems for Advanced Applications - Volume 9643 (165-180). DOI: 10.1007/978-3-319-32049-6_11. Online publication date: 16-Apr-2016
  • (2015) Large-scale Efficient and Effective Video Similarity Search. Proceedings of the 2015 Workshop on Large-Scale and Distributed System for Information Retrieval (3-8). DOI: 10.1145/2809948.2809950. Online publication date: 22-Oct-2015
  • (2015) Gradient-based Signatures for Efficient Similarity Search in Large-scale Multimedia Databases. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (1241-1250). DOI: 10.1145/2806416.2806459. Online publication date: 17-Oct-2015
  • (2015) Earth Mover's Distance vs. Quadratic form Distance: An Analytical and Empirical Comparison. 2015 IEEE International Symposium on Multimedia (ISM) (233-236). DOI: 10.1109/ISM.2015.76. Online publication date: Dec-2015
  • (2015) Effective Content-Based Near-Duplicate Video Detection. 2015 IEEE International Symposium on Multimedia (ISM) (254-257). DOI: 10.1109/ISM.2015.60. Online publication date: Dec-2015
  • (2015) Endoscopic Video Retrieval: A Signature-Based Approach for Linking Endoscopic Images with Video Segments. 2015 IEEE International Symposium on Multimedia (ISM) (33-38). DOI: 10.1109/ISM.2015.21. Online publication date: Dec-2015
