Real-life performance of metric searching

Authors:
Vlastislav Dohnal, Masaryk University, Brno, Czech Republic
Pavel Zezula, Masaryk University, Brno, Czech Republic

SIGSPATIAL Special, Volume 2, Issue 2, July 2010, pp. 28–31. https://doi.org/10.1145/1862413.1862421

Published: 01 July 2010

Abstract

Similarity is a central notion throughout human life, and it will soon become the prevalent strategy for dealing with digital content in computer systems as well. However, the exponential growth of data makes scalability and performance serious matters of concern. Contemporary decentralized mass-communication media, which allow cooperative and collaborative practices, enable users to contribute autonomously to the production of global media, whose elements are in fact related by numerous multifaceted links of similarity. As an example, consider sites like Flickr, YouTube, or Facebook, which host user-contributed heterogeneous content for a variety of events. Accordingly, the core ability of future data processing systems is the similarity management of large and ever-growing volumes of data. Put simply, real-life performance can be constrained from two points of view: (1) the query response time, and (2) the query execution throughput, i.e., the number of queries processed per unit of time. Typically, the query response time should be on-line, say less than one second, whereas the query execution throughput can be expected to reach hundreds or thousands of queries per unit of time in large-scale web applications.
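To make the two performance measures concrete, the following minimal sketch times a brute-force range query over a small synthetic data set and reports both the per-query response time and the overall throughput. It is an illustration only, not the authors' method: the Euclidean distance, the random 2-D points, and the names `range_query` and `measure` are all assumptions introduced here.

```python
import math
import random
import time


def euclidean(a, b):
    """A metric distance function (an illustrative choice, not from the paper)."""
    return math.dist(a, b)


def range_query(data, query, radius, distance=euclidean):
    """Brute-force range query: return every object within `radius` of `query`."""
    return [obj for obj in data if distance(obj, query) <= radius]


def measure(data, queries, radius):
    """Run the queries one after another.

    Returns a list of per-query response times (seconds) and the
    throughput (queries processed per second of wall-clock time).
    """
    response_times = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        range_query(data, q, radius)
        response_times.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return response_times, len(queries) / elapsed


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
    data = [(rng.random(), rng.random()) for _ in range(10_000)]
    queries = [(rng.random(), rng.random()) for _ in range(50)]
    times, qps = measure(data, queries, radius=0.05)
    print(f"worst response time: {max(times):.4f} s")
    print(f"throughput: {qps:.1f} queries/s")
```

With sequential execution, throughput is simply the inverse of the mean response time. The two measures only diverge once queries are processed concurrently, which is why they are treated as separate constraints.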

Published in

SIGSPATIAL Special, Volume 2, Issue 2 (July 2010), 38 pages
EISSN: 1946-7729
DOI: 10.1145/1862413
Copyright © 2010 Authors
Publisher: Association for Computing Machinery, New York, NY, United States