Supporting Text Retrieval by Typographical Term Weighting

Lars Werner, Stefan Böttcher
Copyright: © 2007 | Volume: 3 | Issue: 2 | Pages: 16
ISSN: 1548-3657 | EISSN: 1548-3665 | EISBN13: 9781615203673 | DOI: 10.4018/jiit.2007040101
Cite Article

MLA

Werner, Lars, and Stefan Böttcher. "Supporting Text Retrieval by Typographical Term Weighting." IJIIT vol.3, no.2 2007: pp.1-16. http://doi.org/10.4018/jiit.2007040101

APA

Werner, L. & Böttcher, S. (2007). Supporting Text Retrieval by Typographical Term Weighting. International Journal of Intelligent Information Technologies (IJIIT), 3(2), 1-16. http://doi.org/10.4018/jiit.2007040101

Chicago

Werner, Lars, and Stefan Böttcher. "Supporting Text Retrieval by Typographical Term Weighting," International Journal of Intelligent Information Technologies (IJIIT) 3, no.2: 1-16. http://doi.org/10.4018/jiit.2007040101

Abstract

Text documents stored in information systems usually carry more information than the mere concatenation of their words: they also contain typographic information. Because conventional text retrieval methods evaluate only word frequency, they miss the information that typography provides, e.g., about the importance of certain terms. To overcome this weakness, we present an approach that uses the typographical information of text documents, and we show how it improves the efficiency of text retrieval methods. Our approach weights typographic information in addition to term frequencies to separate the relevant information in text documents from the noise. We evaluated our approach using automated text classification algorithms. The results show that our weighting approach achieves very competitive classification results while using at most 30% of the terms used by conventional approaches, which makes it significantly more efficient.
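
The idea of combining term frequency with typographic emphasis can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no formula, so the style names, the weight values in `STYLE_WEIGHTS`, the function names, and the additive scoring rule are all assumptions made for illustration. The 30% cutoff echoes the figure quoted in the abstract but is used here only as an example parameter.

```python
from collections import Counter

# Hypothetical weights per typographic style (assumed, not from the paper).
STYLE_WEIGHTS = {"plain": 1.0, "italic": 1.5, "bold": 2.0, "heading": 3.0}


def weighted_term_scores(tokens):
    """Sum a style-dependent weight for every occurrence of each term.

    tokens: iterable of (term, style) pairs.
    With all styles "plain", the score reduces to the raw term frequency.
    """
    scores = Counter()
    for term, style in tokens:
        # Unknown styles fall back to the plain-text weight.
        scores[term.lower()] += STYLE_WEIGHTS.get(style, 1.0)
    return scores


def select_top_terms(scores, fraction=0.3):
    """Keep the highest-scoring fraction of distinct terms (at least one)."""
    keep = max(1, int(len(scores) * fraction))
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:keep]]


if __name__ == "__main__":
    doc = [
        ("Retrieval", "heading"),
        ("text", "plain"),
        ("text", "plain"),
        ("the", "plain"),
    ]
    scores = weighted_term_scores(doc)
    print(select_top_terms(scores, fraction=0.3))
```

In this sketch a term that appears once in a heading can outrank a term that appears twice in body text, which is the kind of signal a pure frequency count would miss. The reduced term list could then be passed to any text classifier in place of the full vocabulary.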
