Mixture Model and MDSDCA for Textual Data

  • Conference paper
Cooperative Design, Visualization, and Engineering (CDVE 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5738)

Abstract

E-mail has become an essential component of cooperation in business. Consequently, the large number of messages, whether written manually or generated automatically, can rapidly cause information overload for users. Many research projects have examined this issue, but surprisingly few have tackled the problem of the files attached to e-mails, which in many cases carry a substantial part of the semantics of the message. This paper considers this specific topic and focuses on the clustering and visualization of attached files. Relying on the multinomial mixture model, we use the Classification EM (CEM) algorithm to cluster the set of files and MDSDCA to visualize the resulting classes of documents. Like the Multidimensional Scaling (MDS) method, the MDSDCA algorithm, which is based on Difference of Convex functions (DC) programming, aims to optimize the stress criterion. As MDSDCA is iterative, we propose an initialization approach that avoids starting from random values. Experiments are conducted on simulated and textual data.
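To make the two ingredients named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes documents are given as term-count vectors, alternates the estimation and hard-assignment steps of a classification EM scheme on a multinomial mixture, and evaluates an unweighted raw stress for a low-dimensional configuration. The function names (`cem_multinomial`, `raw_stress`), the additive smoothing constant, the stopping rule, and the unit-weight form of the stress are all assumptions made for this example; the paper's exact criterion and its DC-based optimization are not reproduced here.

```python
import math
import random


def cem_multinomial(counts, k, init_labels=None, n_iter=50, smooth=1.0, seed=0):
    """Hard-assignment (classification) EM on a multinomial mixture.

    counts: list of documents, each a list of term counts of equal length.
    smooth: additive smoothing (an assumption of this sketch) to avoid log(0).
    Returns one cluster label per document.
    """
    rng = random.Random(seed)
    n, d = len(counts), len(counts[0])
    labels = list(init_labels) if init_labels is not None else [
        rng.randrange(k) for _ in range(n)
    ]
    for _ in range(n_iter):
        # M-step: cluster proportions and term probabilities from the partition.
        sizes = [0] * k
        totals = [[smooth] * d for _ in range(k)]
        for doc, z in zip(counts, labels):
            sizes[z] += 1
            for j, c in enumerate(doc):
                totals[z][j] += c
        log_pi = [math.log((s + smooth) / (n + k * smooth)) for s in sizes]
        log_theta = [[math.log(v / sum(row)) for v in row] for row in totals]
        # E-step and C-step: the posterior argmax equals the argmax of
        # log pi_c + sum_j x_j * log theta_cj (the multinomial coefficient
        # is the same for every cluster and can be dropped).
        new_labels = []
        for doc in counts:
            scores = [
                log_pi[c] + sum(x * lt for x, lt in zip(doc, log_theta[c]))
                for c in range(k)
            ]
            new_labels.append(max(range(k), key=scores.__getitem__))
        if new_labels == labels:  # partition is stable: stop
            break
        labels = new_labels
    return labels


def raw_stress(points, target):
    """Unweighted raw stress: sum over i<j of (d_ij(X) - delta_ij)^2."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += (math.dist(points[i], points[j]) - target[i][j]) ** 2
    return total


# Toy usage: two groups of "documents" over a four-term vocabulary.
docs = [[5, 0, 0, 1], [4, 1, 0, 0], [6, 0, 1, 0],
        [0, 0, 5, 4], [0, 1, 4, 6], [1, 0, 6, 5]]
labels = cem_multinomial(docs, k=2, init_labels=[0, 1, 0, 1, 0, 1])
```

In this toy run the initial partition is passed explicitly so the result does not depend on a random start, which loosely echoes the abstract's point about avoiding random initial values for iterative procedures. A stress of zero from `raw_stress` means the configuration reproduces the target dissimilarities exactly.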

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Allouti, F., Nadif, M., Hoai An, L.T., Otjacques, B. (2009). Mixture Model and MDSDCA for Textual Data. In: Luo, Y. (eds) Cooperative Design, Visualization, and Engineering. CDVE 2009. Lecture Notes in Computer Science, vol 5738. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04265-2_35

  • DOI: https://doi.org/10.1007/978-3-642-04265-2_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04264-5

  • Online ISBN: 978-3-642-04265-2

  • eBook Packages: Computer Science, Computer Science (R0)
