Abstract
Nonnegative Matrix Factorization (NMF) is a popular dimension reduction technique of clustering by extracting latent features from high-dimensional data and is widely used for text mining. Several optimization algorithms have been developed for NMF with different cost functions. In this paper we apply several methods of NMF that have been developed for data analysis. These methods vary in using different cost function for matrix factorization and different optimization algorithms for minimizing the cost function. Reuters Document Corpus is used for evaluating the performance of each method. The methods are compared with respect to their accuracy, entropy, purity and computational complexity and residual mean square root error. The most efficient methods in terms of each performance measure are also recognized.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)
Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: Advances in Neural Information Processing Systems, pp. 556–562 (2000)
Hoyer, P.O.: Non-negative sparse coding. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, pp. 557–565. IEEE (2002)
Shahnaz, F., Berry, M.W., Pauca, V.P., Plemmons, R.J.: Document clustering using nonnegative matrix factorization. Information Processing & Management 42(2), 373–386 (2006)
Févotte, C., Bertin, N., Durrieu, J.L.: Nonnegative matrix factorization with the itakura-saito divergence: With application to music analysis. Neural computation 21(3), 793–830 (2009)
Févotte, C., Idier, J.: Algorithms for nonnegative matrix factorization with the β-divergence. Neural Computation 23(9), 2421–2456 (2011)
Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Informaion Retrieval, pp. 267–273. ACM (2003)
Liu, W., Pokharel, P.P., Principe, J.C.: Correntropy: A localized similarity measure. In: International Joint Conference on Neural Networks, IJCNN 2006, pp. 4919–4924. IEEE (2006)
Zdunek, R., Cichocki, A.: Non-negative matrix factorization with quasi-newton optimization. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Żurada, J.M. (eds.) ICAISC 2006. LNCS (LNAI), vol. 4029, pp. 870–879. Springer, Heidelberg (2006)
Berry, M.W., Browne, M., Langville, A.N., Pauca, V.P., Plemmons, R.J.: Algorithms and applications for approximate nonnegative matrix factorization. Computational Statistics & Data Analysis 52(1), 155–173 (2007)
Pang-Ning, T., Steinbach, M., Kumar, V.: Introduction to Data Mining, 1st edn. Addison-Wesley Longman Publishing Co., Inc., Boston (2005)
Hoyer, P.O.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5, 1457–1469 (2004)
Lin, C.J.: Projected gradient methods for nonnegative matrix factorization. Neural computation 19(10), 2756–2779 (2007)
Kim, H., Park, H.: Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J. Matrix Anal. Appl. 30(2), 713–730 (2008)
Kim, J., Park, H.: Fast nonnegative matrix factorization: An active-set-like method and comparisons. SIAM J. Sci. Comput. 33(6), 3261–3281 (2011)
Cichocki, A., Anh-Huy, P.: Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Trans. Fundamentals 92(3), 708–721 (2009)
Li, L., Lebanon, G., Park, H.: Fast bregman divergence nmf using taylor expansion and coordinate descent. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 307–315. ACM (2012)
Du, L., Li, X., Shen, Y.D.: Robust nonnegative matrix factorization via half-quadratic minimization. In: ICDM, pp. 201–210 (2012)
Dhillon, I.S., Sra, S.: Generalized nonnegative matrix approximations with bregman divergences. In: NIPS, vol. 18 (2005)
Kompass, R.: A generalized divergence measure for nonnegative matrix factorization. Neural computation 19(3), 780–791 (2007)
Liu, W., Pokharel, P.P., Principe, J.C.: Correntropy: properties and applications in non-gaussian signal processing. IEEE Trans. Signal Process 55(11), 5286–5298 (2007)
Jeong, K.H., Principe, J.C.: Enhancing the correntropy MACE filter with random projections. Neurocomputing 72(1), 102–111 (2008)
Ensari, T., Chorowski, J., Zurada, J.M.: Correntropy-based document clustering via nonnegative matrix factorization. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds.) ICANN 2012, Part II. LNCS, vol. 7553, pp. 347–354. Springer, Heidelberg (2012)
Ensari, T., Chorowski, J., Zurada, J.M.: Occluded face recognition using correntropy-based nonnegative matrix factorization. In: 11th International Conference on Machine Learning and Applications (ICMLA), vol. 1, pp. 606–609. IEEE (2012)
Schmidt, M.: Matlab software (2008), http://www.di.ens.fr/~mschmidt/Software/minConf.html
Ding, C., Li, T., Peng, W., Park, H.: Orthogonal nonnegative matrix t-factorizations for clustering. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135. ACM (2006)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Hosseini-Asl, E., Zurada, J.M. (2014). Nonnegative Matrix Factorization for Document Clustering: A Survey. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2014. Lecture Notes in Computer Science(), vol 8468. Springer, Cham. https://doi.org/10.1007/978-3-319-07176-3_63
Download citation
DOI: https://doi.org/10.1007/978-3-319-07176-3_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07175-6
Online ISBN: 978-3-319-07176-3
eBook Packages: Computer ScienceComputer Science (R0)