Abstract:
Web crawlers "browse" the World Wide Web (WWW) on behalf of search engines to collect web pages from collections of billions of documents. A metacrawler is similar to a meta search engine in that it combines the top web search results from popular search engines. The World Wide Web is growing rapidly, which poses great challenges to general-purpose crawlers. This paper introduces an architectural framework for a Metacrawler. This crawler enables the user to retrieve information relevant to a topic from more than one traditional web search engine, and it fetches only the pages that are relevant to that topic. The PageRank algorithm is often used to rank web pages, but this ranking causes the problem of topic drift. A modified PageRank algorithm is therefore used to rank the retrieved web pages in a way that reduces this problem. A clustering method is used to combine the search results so that the user can easily select web pages from the clustered results according to their requirements. Experimental results show the effectiveness of the Metacrawler.
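The abstract does not spell out the exact PageRank modification. A common way to reduce topic drift is to bias the teleportation vector toward topic-relevant pages, as in topic-sensitive PageRank; the Python sketch below illustrates that general idea only. The graph, relevance scores, and function names are illustrative assumptions, not the paper's implementation.

# Illustrative sketch only: topic-biased PageRank, assuming the modification
# biases the teleportation vector toward pages relevant to the query topic.
# The example graph, relevance scores, and names are hypothetical; they are
# not taken from the paper.

def topic_pagerank(out_links, relevance, damping=0.85, iters=50):
    """out_links: dict page -> list of pages it links to.
       relevance: dict page -> topic-relevance score (e.g., query similarity)."""
    pages = list(out_links)
    n = len(pages)
    # Normalize relevance scores into a teleportation distribution
    # that favors topic-relevant pages instead of a uniform jump.
    total_rel = sum(relevance.get(p, 0.0) for p in pages) or 1.0
    teleport = {p: relevance.get(p, 0.0) / total_rel for p in pages}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Random-jump portion goes to topic-relevant pages.
        new_rank = {p: (1 - damping) * teleport[p] for p in pages}
        for p in pages:
            targets = out_links[p] or pages  # dangling pages spread rank evenly
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] = new_rank.get(q, 0.0) + share
        rank = new_rank
    return rank

# Usage example: three pages where only A and B are relevant to the topic,
# so C receives little rank even though it is linked.
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}
scores = topic_pagerank(links, relevance={"A": 0.9, "B": 0.7, "C": 0.0})
print(sorted(scores.items(), key=lambda kv: -kv[1]))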
Published in: 2012 12th International Conference on Intelligent Systems Design and Applications (ISDA)
Date of Conference: 27-29 November 2012
Date Added to IEEE Xplore: 24 January 2013