Robin: Extracting Visual and Textual Features from Web Pages

  • Conference paper
Frontiers of WWW Research and Development - APWeb 2006 (APWeb 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3841)

Abstract

Web pages contain information in several forms. These include textual information such as words and visual information such as images, use of color, and layout. We propose a method of extracting the characteristic features from both the textual and visual information in Web pages. Our method enables seamless integration of the two types of information and automatic extraction of their characteristic features. Based on this method, we developed a proof-of-concept system called Robin, which is designed to provide users with an intuitive way of browsing search engine results. The results of an experimental evaluation of the system showed that it has the potential to be practical and effective.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oka, M., Tsukada, H., Kato, K. (2006). Robin: Extracting Visual and Textual Features from Web Pages. In: Zhou, X., Li, J., Shen, H.T., Kitsuregawa, M., Zhang, Y. (eds) Frontiers of WWW Research and Development - APWeb 2006. APWeb 2006. Lecture Notes in Computer Science, vol 3841. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11610113_71

  • DOI: https://doi.org/10.1007/11610113_71

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31142-3

  • Online ISBN: 978-3-540-32437-9

  • eBook Packages: Computer Science, Computer Science (R0)