Abstract
Non-negative matrix factorization (NMF) has received considerable attention in research areas such as document clustering, image analysis, and collaborative filtering. However, NMF-based approaches often suffer from overfitting and interdependent features, both caused by co-adaptation of latent features during learning. Most existing improvements to NMF rely on side information or task-specific knowledge, which are not always available. Dropout is widely recognized as a powerful strategy for preventing co-adaptation when training deep neural networks; moreover, it requires no prior knowledge and adds no extra terms or transformations to the original loss function. In this paper, we introduce the dropout strategy into NMF and propose a dropout NMF algorithm. Specifically, we first design a simple dropout strategy that applies a dropout mask within the NMF framework to prevent feature co-adaptation. We then propose a sequential dropout strategy to reduce randomness and improve robustness. Experimental results on multiple datasets confirm that our dropout NMF methods not only improve NMF itself but also further improve existing representative matrix factorization models.
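The "simple dropout strategy" described above can be illustrated with a minimal sketch. This is not the authors' exact formulation: the function name `dropout_nmf`, the keep-probability `p`, and the per-iteration Bernoulli mask over latent dimensions are illustrative assumptions layered on the standard Lee–Seung multiplicative updates, with dropped dimensions simply excluded from each update.

```python
import numpy as np

def dropout_nmf(X, k, p=0.8, iters=200, seed=0, eps=1e-9):
    """Sketch of NMF with a Bernoulli dropout mask over the k latent features.

    At each iteration a random subset of latent dimensions is dropped:
    those dimensions are excluded from the multiplicative updates and left
    unchanged, which discourages latent features from co-adapting.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        mask = (rng.random(k) < p).astype(float)   # 1 = keep, 0 = drop
        if mask.sum() == 0:                        # keep at least one feature active
            mask[rng.integers(k)] = 1.0
        Wm = W * mask                              # zero out dropped columns of W
        # Lee-Seung multiplicative update for H, applied only to active rows
        H = np.where(mask[:, None] == 1.0,
                     H * (Wm.T @ X) / (Wm.T @ Wm @ H + eps), H)
        Hm = H * mask[:, None]                     # zero out dropped rows of H
        # Corresponding update for W, applied only to active columns
        W = np.where(mask == 1.0,
                     W * (X @ Hm.T) / (W @ Hm @ Hm.T + eps), W)
    return W, H
```

The sequential strategy mentioned in the abstract could, under the same assumptions, be approximated by varying the mask systematically rather than sampling it independently each iteration, e.g. annealing `p` toward 1 as training proceeds.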
Acknowledgements
This research was supported by the National Natural Science Foundation of China under Grant Nos. U1633103 and 61702367, the Science and Technology Planning Project of Tianjin under Grant No. 17ZXRGGX00170, the Key Projects in the Tianjin Science and Technology Pillar Program under Grant No. 17YFZCGX00610, the Research Project of the Tianjin Municipal Commission of Education under Grant No. 2017KJ033, and the Open Project Foundation of the Information Technology Research Base of the Civil Aviation Administration of China under Grant No. CAAC-ITRB-201701.
Cite this article
He, Z., Liu, J., Liu, C. et al. Dropout non-negative matrix factorization. Knowl Inf Syst 60, 781–806 (2019). https://doi.org/10.1007/s10115-018-1259-x