Nonnegative Matrix Factorization for Document Clustering: A Survey

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8468)

Abstract

Nonnegative Matrix Factorization (NMF) is a popular dimension reduction technique that clusters data by extracting latent features from high-dimensional inputs, and it is widely used for text mining. Several optimization algorithms have been developed for NMF with different cost functions. In this paper we apply several NMF methods that have been developed for data analysis. These methods differ in the cost function used for the matrix factorization and in the optimization algorithm used to minimize that cost function. The Reuters Document Corpus is used to evaluate the performance of each method. The methods are compared with respect to accuracy, entropy, purity, computational complexity, and root mean square residual error. The most efficient methods in terms of each performance measure are also identified.
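As an illustration of the kind of pipeline the abstract describes, the following minimal sketch factors a small term-document matrix with Frobenius-norm multiplicative updates in the style of Lee and Seung, assigns each document to its dominant latent feature, and computes purity and root mean square residual error. It is not the authors' implementation: the toy matrix, the labels, and the function names `nmf_multiplicative`, `purity`, and `rmse` are assumptions made for this example, and the paper itself evaluates on the Reuters corpus with several other cost functions and optimizers.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate nonnegative V (terms x documents) as W @ H using
    multiplicative updates for the squared Frobenius cost."""
    rng = np.random.default_rng(seed)
    n_terms, n_docs = V.shape
    # Random nonnegative initialization (an assumption; other schemes exist).
    W = rng.random((n_terms, rank)) + eps
    H = rng.random((rank, n_docs)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def purity(labels_true, labels_pred):
    """Fraction of documents that belong to the majority class of their cluster."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    majority_total = 0
    for cluster in np.unique(labels_pred):
        members = labels_true[labels_pred == cluster]
        majority_total += np.bincount(members).max()
    return majority_total / labels_true.size

def rmse(V, W, H):
    """Root mean square of the residual V - W @ H."""
    return float(np.sqrt(np.mean((V - W @ H) ** 2)))

# Hypothetical toy data: 4 terms x 6 documents, two topics with disjoint vocabularies.
V = np.array([
    [3, 2, 4, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 4, 2, 3],
    [0, 0, 0, 1, 3, 2],
], dtype=float)
labels_true = [0, 0, 0, 1, 1, 1]

W, H = nmf_multiplicative(V, rank=2)
clusters = H.argmax(axis=0)  # each document goes to its largest latent feature
print("clusters:", clusters)
print("purity:", purity(labels_true, clusters))
print("RMSE:", rmse(V, W, H))
```

Assigning documents by the largest entry in each column of `H` is one common convention for NMF-based clustering; the cluster indices are arbitrary, which is why purity rather than raw label agreement is reported here.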

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Hosseini-Asl, E., Zurada, J.M. (2014). Nonnegative Matrix Factorization for Document Clustering: A Survey. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2014. Lecture Notes in Computer Science, vol 8468. Springer, Cham. https://doi.org/10.1007/978-3-319-07176-3_63

  • DOI: https://doi.org/10.1007/978-3-319-07176-3_63

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07175-6

  • Online ISBN: 978-3-319-07176-3

  • eBook Packages: Computer Science, Computer Science (R0)
