Abstract
We present the InfoBeacons system, in which a peer-to-peer network of beacons cooperates to route queries to the best information sources. The routing in our system uses techniques adapted from information retrieval. We examine routing at two levels. First, each beacon is assigned several sources and routes queries to those sources. Many sources are unwilling to provide more cooperation than simple searching, and we must adapt traditional information retrieval techniques to choose the best sources despite this lack of cooperation. Second, beacons route queries to other beacons using techniques similar to those for routing queries to sources. We examine alternative architectures for routing queries between beacons. Results of experiments using a beacon network to search 1,000 information sources demonstrate how our techniques can be used to route queries efficiently; for example, our techniques require contacting up to 70 percent fewer sources than random walk techniques.
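As a rough illustration of the first routing level, the sketch below shows one way a beacon could rank uncooperative sources using only the results they have returned to earlier queries. It is not the scoring function used in the paper: the `Beacon` class, its method names, and the tf-idf-style weighting are all assumptions made for this example.

```python
import math
from collections import Counter


class Beacon:
    """Hypothetical beacon that learns about sources only from their search results."""

    def __init__(self, source_names):
        # term_counts[source][term] = occurrences of term in results seen from source
        self.term_counts = {name: Counter() for name in source_names}

    def observe(self, source, result_text):
        """Record the terms of one result previously returned by `source`."""
        self.term_counts[source].update(result_text.lower().split())

    def rank_sources(self, query):
        """Return source names ordered by an assumed tf-idf-style score, best first."""
        terms = query.lower().split()
        n_sources = len(self.term_counts)
        scores = {}
        for source, counts in self.term_counts.items():
            score = 0.0
            for term in terms:
                tf = counts[term]
                if tf == 0:
                    continue
                # Number of sources whose observed results contain the term.
                df = sum(1 for c in self.term_counts.values() if c[term] > 0)
                idf = math.log(1 + n_sources / df)
                score += (1 + math.log(tf)) * idf
            scores[source] = score
        # Sort by descending score; break ties by name for determinism.
        return sorted(scores, key=lambda s: (-scores[s], s))


if __name__ == "__main__":
    beacon = Beacon(["astro", "bio", "news"])
    beacon.observe("astro", "galaxy cluster redshift survey")
    beacon.observe("bio", "protein folding survey")
    beacon.observe("news", "election results")
    print(beacon.rank_sources("galaxy survey"))
```

A beacon using such a ranking would contact only the top few sources for each query instead of all of them. Routing between beacons could, under the same assumptions, reuse the ranking with neighboring beacons in place of sources.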
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cooper, B.F. (2005). Using Information Retrieval Techniques to Route Queries in an InfoBeacons Network. In: Ng, W.S., Ooi, BC., Ouksel, A.M., Sartori, C. (eds) Databases, Information Systems, and Peer-to-Peer Computing. DBISP2P 2004. Lecture Notes in Computer Science, vol 3367. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31838-5_4
DOI: https://doi.org/10.1007/978-3-540-31838-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25233-7
Online ISBN: 978-3-540-31838-5
eBook Packages: Computer Science (R0)