DDOC: Overlapping Clustering of Words for Document Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3246)

Abstract

This paper examines the benefit of using an overlapping clustering approach, rather than traditional hard clustering, to reduce the dimensionality of the description space for document classification.

The Distributional Divisive Overlapping Clustering (DDOC) method is briefly presented and compared with Agglomerative Distributional Clustering (ADC) [2] and Information-Theoretical Divisive Clustering (ITDC) [3] on two corpora, Reuters-21578 and 20 Newsgroups.
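To make the contrast between hard and overlapping word clustering concrete, the following minimal Python sketch shows how a document could be re-described by word-cluster counts instead of individual words. It does not reproduce DDOC, ADC, or ITDC; the function name `cluster_features` and the toy cluster assignments are hypothetical and chosen only for illustration. The one point it shows is that, under an overlapping clustering, a single word may contribute to several cluster features.

```python
from collections import Counter


def cluster_features(tokens, word_to_clusters, n_clusters):
    """Map a tokenized document to a vector of word-cluster counts.

    word_to_clusters maps each word to the list of cluster ids it belongs to.
    A hard clustering assigns exactly one id per word; an overlapping
    clustering may assign several. Words without an assignment are ignored.
    """
    vec = [0] * n_clusters
    for word, count in Counter(tokens).items():
        for cid in word_to_clusters.get(word, []):
            vec[cid] += count
    return vec


# Hypothetical toy assignments (NOT taken from the paper).
# Cluster 0 ~ "finance", cluster 1 ~ "nature".
hard = {"bank": [0], "loan": [0], "river": [1], "water": [1]}
overlapping = {"bank": [0, 1], "loan": [0], "river": [1], "water": [1]}

doc = ["bank", "river", "water", "bank"]

hard_vec = cluster_features(doc, hard, 2)
overlap_vec = cluster_features(doc, overlapping, 2)

print(hard_vec)     # the ambiguous word "bank" counts only toward cluster 0
print(overlap_vec)  # "bank" counts toward both clusters
```

In both cases the document is described by two features rather than by one feature per vocabulary word, which is the dimensionality reduction the abstract refers to. How the clusters themselves are built is the subject of the paper and is not sketched here.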

References

  1. Cleuziou, G., Martin, L., Clavier, V., Vrain, C.: PoBOC: an Overlapping Clustering Algorithm, Application to Rule-Based Classification and Textual Data. In: Proceedings of the 16th European Conference on Artificial Intelligence ECAI, Valencia, Spain, August 22-27 (2004) (to appear)

  2. Baker, L.D., McCallum, A.K.: Distributional clustering of words for text classification. In: Proceedings of the 21st ACM International Conference on Research and Development in Information Retrieval, Melbourne, AU, pp. 96–103 (1998)

  3. Dhillon, I.S., Mallela, S., Kumar, R.: A divisive information theoretic feature clustering algorithm for text classification. Journal of Machine Learning Research 3, 1265–1287 (2003)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cleuziou, G., Martin, L., Clavier, V., Vrain, C. (2004). DDOC: Overlapping Clustering of Words for Document Classification. In: Apostolico, A., Melucci, M. (eds) String Processing and Information Retrieval. SPIRE 2004. Lecture Notes in Computer Science, vol 3246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30213-1_17

  • DOI: https://doi.org/10.1007/978-3-540-30213-1_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23210-0

  • Online ISBN: 978-3-540-30213-1

  • eBook Packages: Springer Book Archive
