Abstract
Nearest neighbor (NN) search in high-dimensional space is used in many fields and has become a focus of information science. In practice, NN search is often replaced by R-near neighbor (R-NN) search, which fixes a query range R. However, traditional R-NN methods do not perform satisfactorily in high-dimensional space because of the curse of dimensionality. Moreover, some methods offer only probabilistic guarantees and therefore cannot guarantee 100% accuracy. To address these problems, we propose a novel way to build the index structure, based on the mathematical features of the coordinates of the data points. Specifically, we use the mean and the standard deviation of each data point's coordinates to index that point. The method solves R-NN search efficiently in high-dimensional space with a 100% accuracy guarantee. Extensive experimental results demonstrate the effectiveness of the proposed method.
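The abstract does not describe the index structure itself, so the following is only an illustrative sketch of how a coordinate mean and standard deviation can support an exact range filter. It is not the authors' implementation. It relies on a standard inequality: for two points p and q in d dimensions, with population standard deviations, the squared Euclidean distance is at least d * ((mean_p - mean_q)^2 + (std_p - std_q)^2). Any point whose (mean, std) pair violates this bound for radius R can be skipped without losing a true result. The function names `mean_std`, `build_index`, and `r_near_neighbors` are hypothetical, and the linear scan over the index stands in for whatever structure the paper actually uses.

```python
import math


def mean_std(p):
    """Return the mean and population standard deviation of a point's coordinates."""
    d = len(p)
    mu = sum(p) / d
    var = sum((x - mu) ** 2 for x in p) / d
    return mu, math.sqrt(var)


def build_index(points):
    """Precompute (mean, std) for every point (hypothetical flat index)."""
    return [(mean_std(p), p) for p in points]


def r_near_neighbors(index, q, r):
    """Return every indexed point within Euclidean distance r of q.

    Filter step: ||p - q||^2 >= d * ((mu_p - mu_q)^2 + (sd_p - sd_q)^2),
    so a point that exceeds r^2 / d in (mean, std) space cannot be a result.
    Verify step: compute the true distance for the surviving candidates.
    """
    d = len(q)
    mu_q, sd_q = mean_std(q)
    # Small slack so floating-point rounding never prunes a true neighbor.
    bound = (r * r / d) * (1 + 1e-9) + 1e-12
    hits = []
    for (mu, sd), p in index:
        if (mu - mu_q) ** 2 + (sd - sd_q) ** 2 > bound:
            continue  # pruned by the lower bound, no distance computed
        if math.dist(p, q) <= r:
            hits.append(p)
    return hits
```

Because the filter is a true lower bound on the distance, the result set matches a brute-force scan exactly. The saving comes from skipping the d-dimensional distance computation for pruned points. A practical index would presumably organize the (mean, std) pairs in a two-dimensional structure rather than scanning them.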
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, Y., Li, J., Yuan, Y. (2016). A Novel High-Dimensional Index Method Based on the Mathematical Features. In: Wang, Y., Yu, G., Zhang, Y., Han, Z., Wang, G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham. https://doi.org/10.1007/978-3-319-42553-5_22
DOI: https://doi.org/10.1007/978-3-319-42553-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42552-8
Online ISBN: 978-3-319-42553-5
eBook Packages: Computer Science, Computer Science (R0)