Challenges in Using Peer-to-Peer Structures in Order to Design a Large-Scale Web Search Engine

  • Conference paper
Advances in Computer Science and Engineering (CSICC 2008)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 6)

Abstract

Peer-to-peer (P2P) structures are one possible distributed solution for scaling Web Search Engines (WSEs). P2P structures are used successfully in many systems, at lower cost than conventional distributed solutions. However, whether they can also benefit large-scale WSEs remains controversial. In this paper, we introduce the challenges of using P2P structures to design a large-scale WSE. Considering different types of P2P systems, we introduce possible P2P models for this purpose. Using quantitative evaluation, we compare these models from several aspects to determine which is best suited to constructing a large-scale WSE. Our studies indicate that traditional P2P structures are not good choices in this area, and that the best model may be a special case of Super-Peer Networks, which still depends on peers' active and trustworthy contributions.

The complete version of this paper is available on the first author's website.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mousavi, H., Movaghar, A. (2008). Challenges in Using Peer-to-Peer Structures in Order to Design a Large-Scale Web Search Engine. In: Sarbazi-Azad, H., Parhami, B., Miremadi, SG., Hessabi, S. (eds) Advances in Computer Science and Engineering. CSICC 2008. Communications in Computer and Information Science, vol 6. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89985-3_57

  • DOI: https://doi.org/10.1007/978-3-540-89985-3_57

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89984-6

  • Online ISBN: 978-3-540-89985-3

  • eBook Packages: Computer Science, Computer Science (R0)
