Indexing very high-dimensional sparse and quasi-sparse vectors for similarity searches

Wang, Changzhou; Wang, X. Sean

doi:10.1007/s007780100036

Regular contribution
Published: April 2001

Volume 9, pages 344–361, (2001)
The VLDB Journal

Changzhou Wang¹ &
X. Sean Wang²

Abstract.

Similarity queries on complex objects are usually translated into searches among their feature vectors. This paper studies indexing techniques for very high-dimensional (e.g., in hundreds) vectors that are sparse or quasi-sparse, i.e., vectors each having only a small number (e.g., ten) of non-zero or significant values. Based on the R-tree, the paper introduces the xS-tree that uses lossy compression of bounding regions to guarantee a reasonable minimum fan-out within the allocated storage space for each node. In addition, the paper studies the performance and scalability of the xS-tree via experiments.
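The idea of lossily compressing a bounding region can be illustrated with a minimal sketch. The abstract does not describe the xS-tree's actual compression scheme, so everything below is an assumption made for illustration only: the dictionary-based sparse representation, the hypothetical helpers `bounding_box`, `compress`, and `contains`, and the rule of keeping the `k` widest intervals exactly while covering all remaining dimensions with one shared interval. The sketch only shows why such a compression can be conservative: the compressed region always encloses the exact one, so it may admit extra candidates but never excludes a vector that the exact region contains.

```python
from typing import Dict, List, Tuple

SparseVec = Dict[int, float]           # dimension index -> non-zero value
Interval = Tuple[float, float]         # (low, high)
Box = Dict[int, Interval]              # dimensions absent from the dict are (0, 0)


def bounding_box(vectors: List[SparseVec]) -> Box:
    """Exact bounding box; dimensions that are zero in every vector are omitted."""
    dims = set().union(*(v.keys() for v in vectors))
    box: Box = {}
    for d in dims:
        vals = [v.get(d, 0.0) for v in vectors]   # missing entries count as 0
        box[d] = (min(vals), max(vals))
    return box


def compress(box: Box, k: int) -> Tuple[Box, Interval]:
    """Illustrative lossy compression (an assumption, not the paper's scheme):
    keep the k widest intervals exactly and cover every other dimension
    with a single shared interval."""
    ranked = sorted(box, key=lambda d: box[d][1] - box[d][0], reverse=True)
    kept = {d: box[d] for d in ranked[:k]}
    rest = [box[d] for d in ranked[k:]]
    # The shared interval must include 0, since unlisted dimensions may be all-zero.
    lo = min([0.0] + [r[0] for r in rest])
    hi = max([0.0] + [r[1] for r in rest])
    return kept, (lo, hi)


def contains(compressed: Tuple[Box, Interval], v: SparseVec) -> bool:
    """True if sparse vector v lies inside the compressed region."""
    kept, shared = compressed
    # Non-zero entries: check against the exact interval if kept, else the shared one.
    for d, x in v.items():
        low, high = kept.get(d, shared)
        if not low <= x <= high:
            return False
    # Zero entries on kept dimensions must also fall inside the kept interval.
    for d, (low, high) in kept.items():
        if d not in v and not low <= 0.0 <= high:
            return False
    return True


vectors = [{3: 0.9, 17: 0.2}, {3: 0.4, 250: 0.7}, {17: 0.5, 999: 0.1}]
box = bounding_box(vectors)
region = compress(box, k=2)
print(all(contains(region, v) for v in vectors))
```

With a fixed `k`, the stored region has a bounded size regardless of how many dimensions the exact box touches, which is one way a node could guarantee a minimum number of entries within a fixed storage budget. The price is a looser region: a vector such as `{999: 0.4}` falls inside the compressed region here although it lies outside the exact box.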

Author information

Authors and Affiliations

Mathematics and Computing Technology, Phantom Works, The Boeing Company, Bellevue, Washington, USA; E-mail: changzhou.wang@boeing.com
Changzhou Wang
Department of Information and Software Engineering, George Mason University, Fairfax, Virginia, USA; E-mail: xywang@gmu.edu
X. Sean Wang

Additional information

Received: 3 May 1999 / Accepted: 23 October 2000 / Published online: 27 April 2001

About this article

Cite this article

Wang, C., Wang, X. Indexing very high-dimensional sparse and quasi-sparse vectors for similarity searches. The VLDB Journal 9, 344–361 (2001). https://doi.org/10.1007/s007780100036

Issue Date: April 2001
DOI: https://doi.org/10.1007/s007780100036

Key words: Similarity search – High-dimensional indexing structure – Sparse vector – Quasi-sparse vector – Lossy compression
