Document Representation Using Global Association Distance Model

  • Conference paper
Advances in Information Retrieval (ECIR 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4425)

Abstract

Text information processing depends critically on the proper representation of documents. Traditional models, such as the vector space model, are significantly limited because they do not consider semantic relations among terms. In this paper we analyze a document representation based on the association graph scheme and present a new approach called the Global Association Distance Model (GADM). Finally, using a K-NN classifier, we compare GADM with the classical vector space model and the association graph model.
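As an illustration of the baseline named in the abstract, the following is a minimal sketch of a vector space representation combined with a K-NN classifier. It does not reproduce GADM or the association graph model, whose definitions are not given on this page. The function names, the raw term-frequency weighting, the cosine similarity, and the toy training set are all assumptions made for the example.

```python
from collections import Counter
import math


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed weighting).

    Word order and proximity are discarded, which is the limitation
    of the vector space model that the abstract points to.
    """
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_classify(query, training, k=3):
    """Label a query by majority vote among its k most similar documents."""
    q = tf_vector(query)
    ranked = sorted(
        training,
        key=lambda item: cosine(q, tf_vector(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
training = [
    ("stock market prices fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins football match", "sport"),
    ("player scores late goal", "sport"),
]

label = knn_classify("market interest rates rise", training, k=3)
order_blind = cosine(tf_vector("dog bites man"), tf_vector("man bites dog"))
```

In this sketch, `order_blind` comes out at approximately 1 because the two sentences contain the same terms in a different order. Insensitivity of this kind is what motivates representations that take relations among terms into account.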

Editor information

Giambattista Amati Claudio Carpineto Giovanni Romano

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Medina-Pagola, J.E., Rodríguez, A.Y., Hechavarría, A., Hernández Palancar, J. (2007). Document Representation Using Global Association Distance Model. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_52

  • DOI: https://doi.org/10.1007/978-3-540-71496-5_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71494-1

  • Online ISBN: 978-3-540-71496-5

  • eBook Packages: Computer Science, Computer Science (R0)
