DOI: 10.1145/1900008.1900076
research-article

Towards improving a similarity search approach

Published: 15 April 2010

Abstract

In this paper, we present continuing research on data analysis that builds on our previous work on similarity search problems. PanKNN [13] is a novel technique that explores the meaning of K nearest neighbors from a new perspective, redefines the distances between data points and a given query point Q, and efficiently and effectively selects the data points closest to Q. It can be applied in various data mining fields. We present our approach to improving the PanKNN algorithm using the Shrinking concept. Shrinking [15] is a data preprocessing technique that optimizes the inner structure of data, inspired by Newton's law of universal gravitation [11]. The improved approach can help improve the performance of existing data analysis approaches.
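
As a rough illustration of the two ideas named in the abstract, the sketch below pairs a brute-force K-nearest-neighbor query with a simple preprocessing pass that pulls each point toward the centroid of its nearby points. It is not the PanKNN algorithm [13] or the Shrinking procedure [15] as published; the function names `knn` and `shrink_step`, the `radius` parameter, the Euclidean distance, and the sample data are all assumptions made for illustration.

```python
import math

def knn(points, query, k):
    """Return the k points closest to query (brute force, Euclidean distance)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]

def shrink_step(points, radius):
    """One hypothetical shrinking pass: move each point to the centroid
    of all points lying within `radius` of it (the point itself included)."""
    moved = []
    for p in points:
        near = [q for q in points if math.dist(p, q) <= radius]
        moved.append(tuple(sum(q[d] for q in near) / len(near)
                           for d in range(len(p))))
    return moved

# Two loose groups of points (made-up sample data).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9)]

# Preprocess first, then run the similarity query on the condensed data.
shrunk = shrink_step(points, radius=1.0)
nearest = knn(shrunk, query=(0.0, 0.0), k=2)
```

After one pass, the points in each group collapse onto their group centroid, so the query afterwards operates on a more compact structure. Whether such condensation helps a particular similarity search method depends on the data and on the choice of `radius`.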

References

[1]
D. A. White and R. Jain. Similarity indexing with the SS-tree. In Proceedings of the 12th Intl. Conf. on Data Engineering, pages 516--523, New Orleans, Louisiana, February 1996.
[2]
E. Achtert, C. Böhm, P. Kröger, P. Kunath, A. Pryakhin, and M. Renz. Efficient reverse k-nearest neighbor search in arbitrary metric spaces. In SIGMOD '06, pages 515--526, New York, NY, USA, 2006. ACM.
[3]
C. C. Aggarwal. Towards meaningful high-dimensional nearest neighbor search by human-computer interaction. In ICDE, 2002.
[4]
C. C. Aggarwal, A. Hinneburg, and D. A. Keim. On the surprising behavior of distance metrics in high dimensional space. Lecture Notes in Computer Science, vol. 1973, 2001.
[5]
S. Berchtold, D. A. Keim, and H.-P. Kriegel. The X-tree: An index structure for high-dimensional data. In VLDB'96, pages 28--39, Bombay, India, 1996.
[6]
K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is "nearest neighbor" meaningful? In International Conference on Database Theory 99, pages 217--235, Jerusalem, Israel, 1999.
[7]
B. Cui, H. Shen, J. Shen, and K. Tan. Exploring bit-difference for approximate KNN search in high-dimensional databases. In Australasian Database Conference, 2005.
[8]
R. Fagin, R. Kumar, and D. Sivakumar. Efficient similarity search and classification via rank aggregation, 2003.
[9]
A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In The VLDB Journal, pages 518--529, 1999.
[10]
A. Hinneburg, C. C. Aggarwal, and D. A. Keim. What is the nearest neighbor in high dimensional spaces? In The VLDB Journal, pages 506--515, 2000.
[11]
M. A. Rothman. The Laws of Physics. Basic Books, New York, 1963.
[12]
T. Seidl and H.-P. Kriegel. Optimal multi-step k-nearest neighbor search. SIGMOD Rec., 27(2):154--165, 1998.
[13]
Y. Shi and L. Zhang. A dimension-wise approach to similarity search problems. In the 4th International Conference on Data Mining (DMIN'08), 2008.
[14]
R. Weber, H.-J. Schek, and S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In Proc. 24th Int. Conf. Very Large Data Bases, VLDB, pages 194--205, 24--27 1998.
[15]
Y. Shi, Y. Song, and A. Zhang. A shrinking-based clustering approach for multidimensional data. In IEEE Transactions on Knowledge and Data Engineering (TKDE), pages 1389--1403, 2005.

Published In

ACMSE '10: Proceedings of the 48th annual ACM Southeast Conference
April 2010
488 pages
ISBN:9781450300643
DOI:10.1145/1900008
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

ACM SE '10
ACM SE '10: ACM Southeast Regional Conference
April 15 - 17, 2010
Oxford, Mississippi

Acceptance Rates

ACMSE '10 Paper Acceptance Rate 48 of 94 submissions, 51%;
Overall Acceptance Rate 502 of 1,023 submissions, 49%
