Abstract.
Similarity queries on complex objects are usually translated into searches among their feature vectors. This paper studies indexing techniques for very high-dimensional vectors (e.g., with hundreds of dimensions) that are sparse or quasi-sparse, i.e., vectors that each have only a small number (e.g., ten) of non-zero or significant values. Building on the R-tree, the paper introduces the xS-tree, which uses lossy compression of bounding regions to guarantee a reasonable minimum fan-out within the storage space allocated to each node. The paper also evaluates the performance and scalability of the xS-tree experimentally.
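To make the terms in the abstract concrete, the following Python sketch shows one way sparse vectors and a lossily compressed bounding region could look. It is an illustration under stated assumptions, not the xS-tree algorithm: the abstract does not describe the compression scheme, so the rule used here (keep the `m` widest intervals and cover every other dimension with one shared interval) and all function names are hypothetical. The point it illustrates is only that such a compressed region can be stored in bounded space while still enclosing every vector in the node, at the cost of admitting some points the exact box would exclude.

```python
from math import sqrt

def sparse_distance(u, v):
    """Euclidean distance between sparse vectors stored as {dimension: value}."""
    dims = u.keys() | v.keys()  # only dimensions that are non-zero somewhere
    return sqrt(sum((u.get(d, 0.0) - v.get(d, 0.0)) ** 2 for d in dims))

def exact_box(vectors):
    """Exact bounding box as {dimension: (low, high)}.

    Dimensions absent from the result are implicitly the interval (0, 0).
    """
    dims = set().union(*(v.keys() for v in vectors))
    box = {}
    for d in dims:
        values = [v.get(d, 0.0) for v in vectors]
        box[d] = (min(values), max(values))
    return box

def compress_box(box, m):
    """Hypothetical lossy compression: keep the m widest intervals explicitly
    and cover all remaining dimensions with a single shared interval."""
    ranked = sorted(box, key=lambda d: (-(box[d][1] - box[d][0]), d))
    kept = {d: box[d] for d in ranked[:m]}
    rest = [box[d] for d in ranked[m:]]
    # The shared interval must include 0, since unlisted dimensions are all zero.
    low = min([0.0] + [lo for lo, _ in rest])
    high = max([0.0] + [hi for _, hi in rest])
    return kept, (low, high)

def contains(compressed, v):
    """True if sparse vector v lies inside the compressed bounding region."""
    kept, default = compressed
    for d in kept.keys() | v.keys():
        lo, hi = kept.get(d, default)
        if not lo <= v.get(d, 0.0) <= hi:
            return False
    return True

# Three vectors in a space with hundreds of dimensions, each with two non-zeros.
vectors = [{3: 1.0, 40: 2.0}, {3: 4.0, 170: 0.5}, {7: -1.0, 40: 3.0}]
box = exact_box(vectors)
compressed = compress_box(box, m=2)
```

In this sketch the compressed region needs space for only `m` intervals plus one shared interval, regardless of how many dimensions the node's vectors touch. It remains conservative (every indexed vector satisfies `contains`), so a search that prunes with it would not miss results, but it may visit nodes that the exact box would have ruled out.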
Additional information
Received: 3 May 1999 / Accepted: 23 October 2000 / Published online: 27 April 2001
About this article
Cite this article
Wang, C., Wang, X. Indexing very high-dimensional sparse and quasi-sparse vectors for similarity searches. The VLDB Journal 9, 344–361 (2001). https://doi.org/10.1007/s007780100036
DOI: https://doi.org/10.1007/s007780100036