DOI: 10.1145/1135777.1135788
Article

Browsing on small screens: recasting web-page segmentation into an efficient machine learning framework

Published: 23 May 2006

ABSTRACT

Fitting enough information from web pages to make browsing on small screens compelling is a challenging task. One approach is to present the user with a thumbnail image of the full web page and allow the user to press a single key to zoom into a region (which may then be transcoded into WML/XHTML, summarized, etc.). However, if regions for zooming are presented naively, this yields a frustrating experience because of the number of coherent regions, sentences, images, and words that may be inadvertently separated. Here, we cast the web page segmentation problem into a machine learning framework, where we re-examine this task through the lens of entropy reduction and decision tree learning. This yields an efficient and effective page segmentation algorithm. We demonstrate how simple techniques from computer vision can be used to fine-tune the results. The resulting segmentation keeps coherent regions together when tested on a broad set of complex web pages.
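
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the generic idea it names: choosing a cut by entropy reduction, as in standard decision-tree learning on a numeric attribute. It is not the authors' method. The function names `entropy` and `best_cut`, the reduction of a page to one axis of pixel positions, and the element labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_cut(positions, labels):
    """Return (cut, gain) for the threshold with the largest information gain.

    Candidate cuts are midpoints between consecutive distinct positions,
    the usual choice when a decision tree splits on a numeric attribute.
    Returns (None, 0.0) if no cut reduces entropy.
    """
    pairs = sorted(zip(positions, labels))
    base = entropy([lab for _, lab in pairs])
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate equal positions
        cut = (lo + hi) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best[1]:
            best = (cut, gain)
    return best


# Hypothetical data: vertical pixel positions sampled from a page, each
# labelled with the (made-up) element it belongs to.
ys = [10, 20, 30, 110, 120, 130]
elements = ["header", "header", "header", "article", "article", "article"]
cut, gain = best_cut(ys, elements)
```

On this toy input the best cut lands at 70.0, in the gap between the two groups, so neither element is split. Applying the same search recursively to each side would give a decision-tree-style segmentation.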

    • Published in

      WWW '06: Proceedings of the 15th international conference on World Wide Web
      May 2006
      1102 pages
      ISBN: 1595933239
      DOI: 10.1145/1135777

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
