ABSTRACT
Fitting enough information from webpages onto small screens to make browsing compelling is a challenging task. One approach is to present the user with a thumbnail image of the full web page and let the user press a single key to zoom into a region (which may then be transcoded into WML/XHTML, summarized, etc.). However, if the zoom regions are chosen naively, the experience is frustrating because many coherent regions, sentences, images, and words may be inadvertently split apart. Here, we cast the web page segmentation problem into a machine learning framework, re-examining the task through the lens of entropy reduction and decision tree learning. This yields an efficient and effective page segmentation algorithm. We also demonstrate how simple techniques from computer vision can be used to fine-tune the results. The resulting segmentation keeps coherent regions together when tested on a broad set of complex webpages.
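To make the entropy-reduction idea concrete, the following is a minimal sketch of how a decision-tree style split can be chosen by information gain. It is not the algorithm from the paper, whose details are not given in this abstract. It assumes, purely for illustration, that the page has been rasterized into columns of cells and that each cell carries a hypothetical label naming the page element that covers it; the names `entropy`, `best_cut`, and `columns` are invented here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_cut(columns):
    """Return (cut_index, gain) for the cut with the highest information gain.

    `columns` is a list of columns; each column is a list of element labels
    (an assumed representation, one label per cell). A cut at index k
    separates columns[:k] from columns[k:].
    """
    all_labels = [lab for col in columns for lab in col]
    base = entropy(all_labels)
    best_k, best_gain = None, -1.0
    for k in range(1, len(columns)):
        left = [lab for col in columns[:k] for lab in col]
        right = [lab for col in columns[k:] for lab in col]
        # Weighted average entropy of the two sides after the cut.
        weighted = (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(all_labels)
        gain = base - weighted
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain


# Two elements, "a" on the left half and "b" on the right half.
grid = [["a", "a"], ["a", "a"], ["b", "b"], ["b", "b"]]
cut, gain = best_cut(grid)
```

In this toy grid the cut between the second and third columns separates the two elements cleanly, so it has the highest gain; a cut through the middle of an element leaves mixed labels on one side and scores lower. Applying the same selection recursively, and along both axes, is what would make it a decision tree over page regions.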
Index Terms
- Browsing on small screens: recasting web-page segmentation into an efficient machine learning framework