Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5097)

Abstract

This paper presents a new algorithm for crawling a hypertext graph. Using an ant as an agent in the hypertext graph significantly reduces the number of irrelevant hypertext documents that must be downloaded in order to obtain a given number of relevant documents. Moreover, throughout the crawl, the artificial ants do not need a queue for central control of the crawling process. The proposed algorithm, called the Focused Ant Crawling Algorithm, outperforms both the Shark-Search crawling algorithm and a best-first search strategy that uses a queue for central control of the crawling process.
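The abstract does not describe the algorithm's internals, so the following Python sketch only illustrates the contrast it draws between a crawler driven by one central priority queue and agents that choose their next link locally. It is not the authors' Focused Ant Crawling Algorithm: the toy graph, the relevance scores, the function names, and the pheromone update rule are all assumptions made for illustration.

```python
import heapq
import random

# Toy hypertext graph (hypothetical): page -> outgoing links.
GRAPH = {
    "a": ["b", "c"],
    "b": ["d", "e"],
    "c": ["f"],
    "d": [],
    "e": ["g"],
    "f": [],
    "g": [],
}
# Hypothetical relevance scores in [0, 1], known only once a page is fetched.
RELEVANCE = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.7, "e": 0.9, "f": 0.0, "g": 0.8}


def best_first_crawl(seed, budget):
    """Queue-based baseline: one central priority queue orders all known links."""
    frontier = [(-1.0, seed)]  # max-heap emulated by negating the priority
    seen = {seed}
    downloaded = []
    while frontier and len(downloaded) < budget:
        _, page = heapq.heappop(frontier)
        downloaded.append(page)
        for link in GRAPH[page]:
            if link not in seen:
                seen.add(link)
                # Assumed heuristic: a link inherits the relevance of its source page.
                heapq.heappush(frontier, (-RELEVANCE[page], link))
    return downloaded


def ant_crawl(seed, n_ants, steps, rng):
    """Queue-less sketch: each ant walks locally, guided by pheromone on links."""
    pheromone = {(u, v): 1.0 for u, links in GRAPH.items() for v in links}
    downloaded = [seed]
    for _ in range(n_ants):
        page = seed
        for _ in range(steps):
            links = GRAPH[page]
            if not links:
                break  # dead end: this ant stops
            weights = [pheromone[(page, v)] for v in links]
            nxt = rng.choices(links, weights=weights, k=1)[0]
            if nxt not in downloaded:
                downloaded.append(nxt)  # the page is fetched on arrival
            # Assumed rule: reinforce the link by the relevance of its target.
            pheromone[(page, nxt)] += RELEVANCE[nxt]
            page = nxt
    return downloaded, pheromone


if __name__ == "__main__":
    print(best_first_crawl("a", budget=4))
    pages, trail = ant_crawl("a", n_ants=5, steps=3, rng=random.Random(0))
    print(pages)
```

In this sketch the only shared state among the ants is the pheromone stored on the links themselves, whereas the baseline has to maintain a global frontier ordered by priority.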

This work was partly supported by the Foundation for Polish Science (Professorial Grant 2005-2008) and the Polish State Committee for Scientific Research (Grant N516 020 31/1977), Special Research Project 2006-2009, Polish-Singapore Research Project 2008-2010, Research Project 2008-2010.

Editor information

Leszek Rutkowski Ryszard Tadeusiewicz Lotfi A. Zadeh Jacek M. Zurada

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dziwiński, P., Rutkowska, D. (2008). Ant Focused Crawling Algorithm. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2008. ICAISC 2008. Lecture Notes in Computer Science, vol 5097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69731-2_96

  • DOI: https://doi.org/10.1007/978-3-540-69731-2_96

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69572-1

  • Online ISBN: 978-3-540-69731-2

  • eBook Packages: Computer Science, Computer Science (R0)
