Abstract
With the rapid growth of information on the Internet, web information collection is becoming increasingly important in many web applications, especially search engines. The performance of web information collectors greatly influences the quality of search engines, so discussion of web spiders usually focuses on their speed and accuracy. In this paper, we argue that customizability is also an important feature of a well-designed spider: a spider should be able to provide multi-modal services that satisfy users with different requirements and preferences. We have developed a parallel web spider system based on multi-agent techniques. It runs with high speed and high accuracy and, most importantly, it can provide its services from multiple perspectives, with good extensibility and personalized customizability.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
He, Q., Zhao, X., Zhang, S. (2006). Multi-modal Services for Web Information Collection Based on Multi-agent Techniques. In: Shi, ZZ., Sadananda, R. (eds) Agent Computing and Multi-Agent Systems. PRIMA 2006. Lecture Notes in Computer Science, vol 4088. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11802372_15
DOI: https://doi.org/10.1007/11802372_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36707-9
Online ISBN: 978-3-540-36860-1
eBook Packages: Computer Science, Computer Science (R0)