Approximate searches: k-neighbors + precision

Authors:
Sid-Ahmed Berrani (IRISA, Cesson-Sévigné, France),
Laurent Amsaleg (CNRS -- IRISA, Rennes, France),
Patrick Gros (CNRS -- IRISA, Rennes, France)

CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management, November 2003, Pages 24–31, https://doi.org/10.1145/956863.956870

Published: 03 November 2003

ABSTRACT

It is known that all multi-dimensional index structures fail to accelerate content-based similarity searches when the feature vectors describing images are high-dimensional. It is possible to circumvent this problem by relying on approximate search schemes that trade off result quality for reduced query execution time. Most approximate schemes, however, provide no control, or only complex control, over the precision of the searches, especially when retrieving the k nearest neighbors (NNs) of query points. In contrast, this paper describes an approximate search scheme for high-dimensional databases where the precision of the search can be probabilistically controlled when retrieving the k NNs of query points. It allows fine and intuitive control over this precision by setting, at run time, the maximum probability for a vector that would be in the exact answer set to be missed in the approximate set of answers eventually returned. This paper also presents a performance study of the implementation using real datasets, showing its reliability and efficiency. It shows, for example, that our method is 6.72 times faster than the sequential scan when it handles more than 5 × 10⁶ 24-dimensional vectors, even when the probability of missing one of the true nearest neighbors is below 0.01.
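
The precision notion used in the abstract (the chance that a vector of the exact answer set is missing from the approximate one) can be illustrated empirically. The following is a minimal sketch and not the paper's algorithm: `sequential_scan_knn` is the exact baseline, `miss_rate` measures the fraction of true neighbors that an approximate answer lacks, and `subset_scan_knn` is a hypothetical stand-in for an approximate search that simply scans an arbitrary subset of the database. All function names, the Euclidean distance, and the synthetic data are assumptions made for illustration.

```python
import heapq
import math
import random


def sequential_scan_knn(database, query, k):
    """Exact k nearest neighbors by scanning every vector (the baseline)."""
    # nsmallest returns the k indices with the smallest Euclidean distance,
    # ordered from nearest to farthest.
    return heapq.nsmallest(
        k, range(len(database)), key=lambda i: math.dist(database[i], query)
    )


def miss_rate(exact_ids, approx_ids):
    """Fraction of the exact k-NN answer set absent from the approximate one."""
    exact = set(exact_ids)
    return len(exact - set(approx_ids)) / len(exact)


def subset_scan_knn(database, query, k, candidate_ids):
    """Hypothetical approximate search: scan only the given candidate vectors."""
    return heapq.nsmallest(
        k, candidate_ids, key=lambda i: math.dist(database[i], query)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n, k = 24, 2000, 10  # 24 dimensions as in the abstract; n is arbitrary
    db = [[rng.random() for _ in range(dim)] for _ in range(n)]
    q = [rng.random() for _ in range(dim)]

    exact = sequential_scan_knn(db, q, k)
    candidates = rng.sample(range(n), n // 2)  # arbitrary half of the database
    approx = subset_scan_knn(db, q, k, candidates)
    print("empirical miss rate:", miss_rate(exact, approx))
```

Averaging `miss_rate` over many queries gives an empirical estimate that can be compared against a target bound such as the 0.01 quoted in the abstract. The scheme described in the paper fixes that bound at run time, which this naive subset scan cannot do.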

Index Terms

Approximate searches: k-neighbors + precision
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Published in
CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management
November 2003
592 pages
ISBN: 1581137230
DOI: 10.1145/956863
General Chair: Donald Kraft (Louisiana State University)
Program Chairs: Ophir Frieder (Illinois Institute of Technology), Joachim Hammer (University of Florida), Sajda Qureshi (University of Nebraska, Omaha), Len Seligman (The MITRE Corporation)
Copyright © 2003 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
approximate nearest-neighbor searches
multimedia databases
similarity searches