DOI: 10.1145/378420.378788

Controlling the robots of Web search engines

Published: 1 June 2001

ABSTRACT

Robots are deployed by a Web search engine to collect information from different Web servers in order to keep its database of Web pages current. In this paper, we investigate how many robots a search engine should use so as to maximize the currency of the database without putting an unnecessary load on the network. We adopt a finite-buffer queueing model to represent the system. The arrivals to the queueing system are Web pages brought by the robots; service corresponds to the indexing of these pages. Good performance requires that the number of robots, and thus the arrival rate of the queueing system, be chosen so that the indexing queue is rarely starved or saturated. Thus, we formulate a multi-criteria stochastic optimization problem with the loss rate and empty-buffer probability as the criteria. We take the common approach of reducing the problem to one with a single objective that is a linear function of the given criteria. Both static and dynamic policies can be considered. In the static setting the number of robots is held fixed; in the dynamic setting robots may be reactivated or deactivated as a function of the state. Under the assumption that arrivals form a Poisson process and that service times are independent and exponentially distributed random variables, we determine an optimal decision rule for the dynamic setting, i.e., a rule that varies the number of robots in such a way as to minimize a given linear function of the loss rate and empty-buffer probability. Our results are compared with known results for the static case. A numerical study indicates that substantial gains can be achieved by dynamically controlling the activity of the robots.
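The two criteria named in the abstract can be illustrated with a small Python sketch of a finite-buffer birth-death queue. This is not the paper's algorithm or its optimal rule; it only evaluates the loss rate and empty-buffer probability for a given policy. Several things here are assumptions made for illustration: the state is taken to be the number of pages waiting in the buffer, every active robot is assumed to contribute the same Poisson rate `rate_per_robot`, and all names (`robots_in_state`, `w_loss`, `w_empty`, and so on) are hypothetical.

```python
from typing import Callable, List, Tuple


def stationary_distribution(
    robots_in_state: Callable[[int], int],
    rate_per_robot: float,
    service_rate: float,
    buffer_size: int,
) -> List[float]:
    """Stationary distribution of a birth-death chain on states 0..buffer_size.

    Assumption: in state i the arrival rate is robots_in_state(i) * rate_per_robot,
    and the (exponential) indexing rate is service_rate.
    """
    weights = [1.0]
    for i in range(buffer_size):
        arrival_rate = robots_in_state(i) * rate_per_robot
        # Detailed balance: pi[i + 1] * service_rate = pi[i] * arrival_rate
        weights.append(weights[-1] * arrival_rate / service_rate)
    total = sum(weights)
    return [w / total for w in weights]


def criteria(
    robots_in_state: Callable[[int], int],
    rate_per_robot: float,
    service_rate: float,
    buffer_size: int,
) -> Tuple[float, float]:
    """Return (loss rate, empty-buffer probability) for the given policy."""
    pi = stationary_distribution(
        robots_in_state, rate_per_robot, service_rate, buffer_size
    )
    # Pages that arrive while the buffer is full are lost.
    loss_rate = robots_in_state(buffer_size) * rate_per_robot * pi[buffer_size]
    return loss_rate, pi[0]


def best_static_count(
    max_robots: int,
    rate_per_robot: float,
    service_rate: float,
    buffer_size: int,
    w_loss: float,
    w_empty: float,
) -> Tuple[int, float]:
    """Brute-force search over fixed robot counts for the lowest linear cost."""
    best_n, best_cost = 0, float("inf")
    for n in range(1, max_robots + 1):
        loss, empty = criteria(
            lambda _state, n=n: n, rate_per_robot, service_rate, buffer_size
        )
        cost = w_loss * loss + w_empty * empty
        if cost < best_cost:
            best_n, best_cost = n, cost
    return best_n, best_cost
```

A static policy is the special case `lambda state: n`. A dynamic policy can be any function of the buffer state, for example one that deactivates all robots when the buffer is full, which drives the loss rate to zero in this model while leaving the empty-buffer probability to be traded off through the weights.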

Published in

SIGMETRICS '01: Proceedings of the 2001 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2001
347 pages
ISBN: 1581133340
DOI: 10.1145/378420
Chairman: Mary Vernon

Copyright © 2001 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGMETRICS '01 paper acceptance rate: 29 of 233 submissions (12%)
Overall acceptance rate: 459 of 2,691 submissions (17%)
