Abstract
We present a novel graph-based approach to fast similarity search that is suitable for large-scale, high-dimensional data sets. We focus on a well-known feature of small-world networks, namely that they are “searchable,” and propose an efficient index structure called a degree-reduced nearest neighbor graph. A similarity search is then formulated as the problem of finding the object most similar to a query object by following the links in this graph with a best-first neighborhood search algorithm. The experimental results show that the proposed search method significantly reduces search costs. In particular, when applied to data sets consisting of nearly one million documents, it reduces the average number of similarity evaluations to only 0.9% of the total number of documents.
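The idea of searching by following graph links with a best-first strategy can be illustrated with a minimal sketch. The abstract does not describe how the degree-reduced nearest neighbor graph is built, so the sketch below substitutes a plain brute-force k-nearest-neighbor graph as a stand-in index. The function names (`cosine_similarity`, `build_knn_graph`, `best_first_search`), the use of cosine similarity, and the stopping rule are all assumptions made for illustration, not the authors' implementation.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_knn_graph(vectors, k):
    """Brute-force k-NN graph: a stand-in index, NOT the degree-reduced graph."""
    graph = {}
    for i, v in enumerate(vectors):
        sims = [(cosine_similarity(v, w), j)
                for j, w in enumerate(vectors) if j != i]
        graph[i] = [j for _, j in heapq.nlargest(k, sims)]
    return graph


def best_first_search(graph, vectors, query, start):
    """Greedy best-first search over graph links.

    Repeatedly expands the most similar unexpanded vertex and stops once
    that vertex is less similar to the query than the best vertex found.
    Returns (best_vertex, number_of_similarity_evaluations).
    """
    evaluated = {start: cosine_similarity(query, vectors[start])}
    best = start
    frontier = [(-evaluated[start], start)]  # max-heap via negated similarity
    while frontier:
        neg_sim, node = heapq.heappop(frontier)
        if -neg_sim < evaluated[best]:
            break  # no unexpanded vertex can improve by this rule
        for neighbor in graph[node]:
            if neighbor not in evaluated:
                evaluated[neighbor] = cosine_similarity(query, vectors[neighbor])
                heapq.heappush(frontier, (-evaluated[neighbor], neighbor))
                if evaluated[neighbor] > evaluated[best]:
                    best = neighbor
    return best, len(evaluated)
```

The second return value counts how many objects were compared with the query, which corresponds to the "number of similarity evaluations" that the abstract uses as its cost measure. A greedy search of this kind can stop at a local optimum on an arbitrary graph, so the quality of the result depends on the structure of the index.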
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Aoyama, K., Saito, K., Yamada, T., Ueda, N. (2009). Fast Similarity Search in Small-World Networks. In: Fortunato, S., Mangioni, G., Menezes, R., Nicosia, V. (eds) Complex Networks. Studies in Computational Intelligence, vol 207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01206-8_16
DOI: https://doi.org/10.1007/978-3-642-01206-8_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01205-1
Online ISBN: 978-3-642-01206-8
eBook Packages: Engineering, Engineering (R0)