Abstract
In this paper, we propose an automatic summarization system to ease web browsing for visually impaired people on handheld devices. In particular, we propose a new architecture for summarizing Semantic Textual Units [2] based on efficient algorithms for linguistic processing [3][6], which allow real-time processing and deeper linguistic analysis of web pages and thus enable high-quality content visualization. Moreover, we present a text-to-speech interface to ease the understanding of web page content. To our knowledge, this is the first attempt to use both statistical and linguistic techniques for text summarization for browsing on mobile devices.
References
Berger, A., Mittal, V.: Ocelot: a System for Summarizing Web Pages. In: Proc. of SIGIR (2000)
Buyukkokten, O., Garcia-Molina, H., Paepcke, A.: Seeing the Whole in Parts: Text Summarization for Web Browsing on Handheld Devices. In: Proc. of the 10th International World Wide Web Conference (2000)
Brants, T.: TnT - a Statistical Part-of-Speech Tagger. In: Proc. of the 6th Applied NLP Conference (2000)
Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30(1-7) (1998)
Dolan, W.B., Quirk, C., Brockett, C.: Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources. In: Proc. of COLING (2004)
Gil, A., Dias, G.: Using Masks, Suffix Array-based Data Structures and Multidimensional Arrays to Compute Positional Ngram Statistics from Corpora. In: Proc. of the Workshop on Multiword Expressions of the 41st ACL (2003)
Gomes, P., et al.: Web-Clipping: Compression Heuristics for Displaying Text on a PDA. In: Proc. of Workshop on HCI with Mobile Devices (2001)
Justeson, J., Katz, S.: Technical Terminology: some Linguistic Properties and an Algorithm for Identification in text. Natural Language Engineering (1) (1995)
Luhn, H.P.: The automatic creation of literature abstracts. IBM Journal of Research and Development (1958)
Mihalcea, R., Tarau, P.: TextRank: Bringing Order into Texts. In: Proc. of the Conference on Empirical Methods in Natural Language Processing (2004)
Salton, G., Yang, C.S., Yu, C.T.: A Theory of Term Importance in Automatic Text Analysis. Amer. Soc. of Inf. Science 1(26) (1975)
Stolcke, A.: SRILM – An Extensible Language Modeling Toolkit. In: Proc. of International Conference on Spoken Language Processing (2002)
Vechtomova, O., Karamuftuoglu, M.: Comparison of Two Interactive Search Refinement Techniques. In: Proc. of HLT-NAACL (2004)
Yang, C., Wang, F.L.: Fractal Summarization for Mobile Devices to Access Large Documents on the Web. In: Proc. of the International World Wide Web Conference (2003)
Zhang, Y., Zincir-Heywood, N., Milios, E.: Summarizing Web Sites Automatically. In: Proc. Conference of Canadian Society for Computational Studies of Intelligence (2003)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dias, G., Conde, B. (2007). Accessing the Web on Handheld Devices for Visually Impaired People. In: Wegrzyn-Wolska, K.M., Szczepaniak, P.S. (eds) Advances in Intelligent Web Mastering. Advances in Soft Computing, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72575-6_13
DOI: https://doi.org/10.1007/978-3-540-72575-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72574-9
Online ISBN: 978-3-540-72575-6
eBook Packages: Engineering, Engineering (R0)