
Algorithmic Challenges in Web Search Engines

  • Conference paper
Experimental Algorithms (WEA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4007)

Abstract

We present the main algorithmic challenges that large Web search engines face today. These challenges are present in all the modules of a Web retrieval system, ranging from the gathering of the data to be indexed (crawling) to the selection and ordering of the answers to a query (searching and ranking). Most of the challenges are ultimately related to the quality of the answer or the efficiency in obtaining it, although some, such as context-based advertising, are relevant even to the existence of current search engines.

As the Web grows and changes at a fast pace, the algorithms behind these challenges must rely on large-scale experimentation, both in data volume and computation time, to understand the main issues that affect them. We show examples of our own research and of the state of the art. The full version of this paper appears in [1].

References

  1. Baeza-Yates, R.: Algorithmic Challenges in Web Search Engines. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 1–7. Springer, Heidelberg (2006)

  2. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval, p. 513. Addison-Wesley, England (1999)

  3. Baeza-Yates, R.: Information Retrieval in the Web: beyond current search engines. Int. Journal of Approximate Reasoning 34(2-3), 97–104 (2003)

  4. Baeza-Yates, R., Castillo, C., Marin, M., Rodriguez, A.: Crawling a Country: Better Strategies than Breadth-First for Page Ordering. In: WWW 2005, Industrial Track, ACM Press, Chiba, Japan (2005)

  5. Baeza-Yates, R.A., Hurtado, C.A., Mendoza, M.: Query Recommendation Using Query Logs in Search Engines. In: Lindner, W., Mesiti, M., Türker, C., Tzitzikas, Y., Vakali, A.I. (eds.) EDBT 2004. LNCS, vol. 3268, pp. 588–596. Springer, Heidelberg (2004)

  6. Baeza-Yates, R.: A Fast Set Intersection Algorithm for Sorted Sequences. In: 15th Combinatorial Pattern Matching 2004, Istanbul, Turkey. LNCS, Springer (2004)

  7. Baeza-Yates, R.: Applications of Web Query Mining. In: Losada, D.E., Fernández-Luna, J.M. (eds.) ECIR 2005. LNCS, vol. 3408, Springer, Heidelberg (2005)

  8. Baeza-Yates, R., Poblete, B.: A Website Mining Model Centered on User Queries. In: Berendt, B., et al. (eds.) European Web Mining Forum, Oporto, Portugal, October 2005, pp. 3–15 (2005)

  9. Baeza-Yates, R., Pereira, A., Ziviani, N.: WIM: A Web Information Mining Model for the Web. In: LA-WEB 2005, pp. 233–241. IEEE CS Press, Los Alamitos (2005)

  10. Bhargava, H.K., Feng, J.: Paid placement strategies for internet search engines. In: Proceedings of the eleventh international conference on World Wide Web, pp. 117–123. ACM Press, New York (2002)

  11. Chakrabarti, S.: Mining the Web: Discovering knowledge from hypertext data. Morgan Kaufmann, San Francisco (2003)

  12. Davison, B.: Workshop on Adversarial Information Retrieval on the Web, Chiba, Japan (May 2005), http://airweb.cse.lehigh.edu/2005/

  13. Kleinberg, J.: Authoritative sources in a hyperlinked environment. Journal of the ACM 46(5), 604–632 (1999); preliminary version presented at SODA 1998

  14. Kleinberg, J., Raghavan, P.: Query Incentive Networks. In: Proc. 46th IEEE Symposium on Foundations of Computer Science (2005)

  15. Koster, M.: A standard for robot exclusion (1996), http://www.robotstxt.org/wc/exclusion.html

  16. Makinen, V., Navarro, G.: Compressed Full Text Indexes. Technical Report TR/DCC-, -7, Dept. of Computer Science, University of Chile (June 2005), Available at: http://pizzachili.dcc.uchile.cl/biblio.html

  17. Nicholson, S., Sierra, T., Eseryel, U.Y., Park, J.H., Barkow, P., Pozo, E.J., Ward, J.: How Much of It is Real? Analysis of Paid Placement in Web Search Engine Results. In: JASIST (2005)

  18. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the Web. Technical report, Stanford Digital Library Technologies Project (1998)

  19. Ribeiro-Neto, B., Cristo, M., Golgher, P., Silva de Moura, E.: Impedance coupling in content-targeted advertising. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2005, Salvador, Brazil, August 15–19, 2005, pp. 496–503. ACM Press, New York (2005)

  20. Wellman, B.: Computer Networks As Social Networks. Science 293(5537), 2031–2034 (2001)

  21. Yao, A.C.-C. (ed.): WINE 2005. LNCS, vol. 3828. Springer, Heidelberg (2005), http://www.cs.cityu.edu.hk/~wine2005/

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baeza-Yates, R. (2006). Algorithmic Challenges in Web Search Engines. In: Àlvarez, C., Serna, M. (eds) Experimental Algorithms. WEA 2006. Lecture Notes in Computer Science, vol 4007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11764298_25

  • DOI: https://doi.org/10.1007/11764298_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34597-8

  • Online ISBN: 978-3-540-34598-5

  • eBook Packages: Computer Science, Computer Science (R0)
